Re: [R] Return
It depends on which return you want. See the R computation section of A tale of two returns. http://www.portfolioprobe.com/2010/10/04/a-tale-of-two-returns/ Pat On 05/07/2012 05:58, Akhil dua wrote: Hello Every one I have data on Stock prices and I want to calculate the return on all the stocks and then replace all the stock prices with the returns can any one tell me how to do My data is in the format given below Date Stock1 Stock2 Stock3 01/01/20001 2 3 01/02/20005 6 7 01/03/20001 2 3 01/04/20005 6 7 Thanks [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Patrick Burns pbu...@pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help! Please recommend good books/resources on visualizing data and understanding multivariate relations...
I'm a fan of Ggobi--which works well with or without R--but I'm not sure how it handles enormous data sets. I second the Tufte recommendation, and add: Interactive and Dynamic Graphics for Data Analysis. -- View this message in context: http://r.789695.n4.nabble.com/Help-Please-recommend-good-books-resources-on-visualizing-data-and-understanding-multivariate-relati-tp4635344p4635456.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Return on Stock Market
Hello Every one I have data on Stock prices and I want to calculate the return on all the stocks and then replace all the stock prices with the returns can any one tell me how to do My data is in the format given below Date Stock1 Stock2 Stock3 01/01/20001 2 3 01/02/20005 6 7 01/03/20001 2 3 01/04/20005 6 7 Thanks [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Binary Quadratic Opt?
Thanks Petr for the reply. Let me do implementation and see how how goes. Enjoy your vacation. Rest fine Khris On Jul 4, 2012, at 8:25 PM, Petr Savicky [via R] wrote: On Mon, Jul 02, 2012 at 06:11:37AM -0700, khris wrote: Hi, Petr, Hi Khris: If i understand the problem correctly, you have a list of (x,y) coordinates, where some sensor is located, but you do not know, which sensor is there. The database contains data for each sensor identified in some way, but you do not know the mapping between sensor identifiers from the database and the (x,y) coordinates. Is this correct? Yes. So I modelled the problem as inexact match between 2 Graphs. Since the best package on Graphs i.e. iGraph does not have any function for Graph matching I think, the problem is close to http://en.wikipedia.org/wiki/Graph_isomorphism You have estimates of the distances between the sensors using identifiers from the database. So, you know, which pairs of sensors are close. This is one graph. The other graph is the graph of closeness between the known (x,y) coordinates. You want to find a mapping between the vertices of these two graphs, which preserves edges. Yes, I agree the problem is more into Graph theoretic domain to be more precise inexact graph matching whose generalization is the Graph Isomorphism problem. The problem is more general than Graph Isomorphism. Let me define the problem more formally. We have 2 weighted undirected graphs. In one graph I know the distance of every vertex from every other vertex whereas in another graph I know only which vertices are close to a given vertex. So I know the neighboring vertices given a vertex. So the distance matrix of other Graph is incompletely known. So the question is can I find the best alignment between the 2 graphs. Ex:- G1 is know the complete distance matrix. For G2, if there are four vertices let's say (v1, v2, v3 v4) the I know edge weight (v1,v2) and (v1,v3) but have no information of edge weight(v1,v4). Similarly I know about (v2,v3) but no information about edge weights (v2,v4) or (v3,v4). So I was thinking of not to model it as general inexact Graph matching problem for then the complexity n^4. It seems the best way to model the solution is to consider only edges with are at distance of 1 unit i.e. closest edge from every vertex and not every edge from the given vertex. This will bring down the complexity from n^4 to 6*6*n^2 assuming every vertex has atmost 6 neighboring vertex. Quadratic complexity seems manageable. Ofcourse now the solution become lot more sensitive to the errors in Graph G2. Assuming best case if I have no errors in G2 i.e. for every vertex I know correctly it's closest neighbored in the rectangular grid then optimizing distance between G1 and G2 should give me best correct alignment. This seems to be the best approach under current circumstance. As far as implementation goes I think I still have to use optimization package since there are not any readily and freely available function for inexact graph matching. Petr how do you feel about it. Appreciate your feedback. Hi. I am on vacations with only occasional access to the internet. One way to solve your problem is to formulate it in a way, in which it can be solved by some existing solver. I do not know, whether this is possible. Look at self-organizing maps (Kohonen maps). They try to map points with unknown mutual relationships to a 2 dimensional grid. However, i am not sure, whether the input to this method is suitable for your problem. Another way is to write your own program. I sent some suggestions in this direction in a previous email. If you want, we can discuss this in more detail next week, when i am back from vacations. All the best, Petr. __ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. If you reply to this email, your message will be added to the discussion below: http://r.789695.n4.nabble.com/Binary-Quadratic-Opt-tp4633521p4635416.html To unsubscribe from Binary Quadratic Opt?, click here. NAML -- View this message in context: http://r.789695.n4.nabble.com/Binary-Quadratic-Opt-tp4633521p4635457.html Sent from the R help mailing list archive at Nabble.com. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] evaluating expressions contained in a dataframe
Hello, Your eval(tests$rule[1]) needs a slight modification to get the results. Try this: tests-read.table(text= rule info$country=='Greenland' info$age50 ,sep=,header=TRUE,stringsAsFactors=FALSE) info-data.frame(first=rep(NA,5),country=c(GReenland,Iceland,Ireland,Greenland,Greenland) ,age=c(30,55,66,79,80),name=c(Mary,Paul,Robert,John,Ivan)) #1st condition eval(parse(text=tests$rule[1])) [1] FALSE FALSE FALSE TRUE TRUE #2nd condition eval(parse(text=tests$rule[2])) [1] FALSE TRUE TRUE TRUE TRUE # tests[1,1] condition: subset info info[eval(parse(text=tests$rule[1])),] first country age name 4 NA Greenland 79 John 5 NA Greenland 80 Ivan #tests[2,1] condition: subset info info[eval(parse(text=tests$rule[2])),] first country age name 2 NA Iceland 55 Paul 3 NA Ireland 66 Robert 4 NA Greenland 79 John 5 NA Greenland 80 Ivan A.K. - Original Message - From: New RUser newruser2...@gmail.com To: r-help@r-project.org Cc: Sent: Tuesday, July 3, 2012 12:24 PM Subject: [R] evaluating expressions contained in a dataframe #I have a dataframe called tests that contain character expressions. These characters are rules that use data from within another dataframe. Is there any way within R I can access the rules in the dataframe called tests, and then evaluate these rules? #An example may better explain what I am trying to accomplish: tests - data.frame(matrix(data=c(info$country == 'Greenland', info$age 50), nrow=2, ncol=1)) names(tests) - rule info - data.frame(matrix(data=NA, nrow=5, ncol=3)) names(info) - c(first, country, age) info$name - c(Mary, Paul, Robert, John, Ivan) info$country - c(GReenland, Iceland, Ireland, Greenland, Greenland) info$age - c(30, 55, 66, 79, 80) #e.g. for info$country == 'Greenland' #I want: info$country == 'Greenland' [1] FALSE FALSE FALSE TRUE TRUE #e.g. info$country == 'Greenland' info$age 50 #I tried this, but it does not work: eval(tests$rule[1]) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Adding Text to a persp plot using specific coordinates
Hi folks, I'm experiencing some hassle to add some text on a persp plot... Here is the code to generate the persp plot: x - seq(-1.95, 1.95, length = 30) y - seq(-1.95, 1.95, length = 30) z - outer(x, y, function(a,b) a*b^2) persp(x, y, z,xlim=c(-2,2),ylim=c(-2,2),zlim=c(-8,8), phi=30, theta=-30,nticks=8,ticktype=detailed) I've tried the text() function : text(x=0,y=0,z=0,texts=O) It returns the following error message: Error: evaluation nested too deeply: infinite recursion / options(expressions=)? I've seen that text3d from the rgl library could add text on a plot3d object or a persp object. But when I try this: text3d(x=0,y=0,z=1,texts=O) The result is that R opens a new window with the string O printed in the new window. And no text is added to my previous persp plot. I've seen that maybe using trans3d would help but after many tries, I gave up. This function did not allow me to plot text with the specified coordinates. Any ideas on how adding some text or labels to an existing persp plot using the coordinates I've entered? Thanking you in anticipation. Dubreuil Benjamin Research Assistant in Bioinformatics Michnick Lab - Dept of Biochemistry - Université de Montréal [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Return
Hi, I guess this is what you are looking for: dat1 - read.table(text= Date Stock1 Stock2 Stock3 Market 01/01/2000 1 2 3 4 01/02/2000 5 6 7 8 01/03/2000 1 2 3 4 01/04/2000 5 6 7 8 , header=TRUE, stringsAsFactors=FALSE) dat2-data.frame(Date=as.Date(dat1[,1],format=%m/%d/%Y),dat1[,2:4]) library(zoo) Stocks-as.matrix(dat1[,2:4]) Stockreturn-zoo(Stocks) Stockreturn-diff(log(Stockreturn)) Stockreturn-data.frame(Stockreturn) Stockreturn1-Stockreturn Stockreturn1[1,]-NA colnames(Stockreturn1[1,])-colnames(Stockreturn) dat3-data.frame(Date=dat2[,1], rbind(Stockreturn1[1,],Stockreturn)) row.names(dat3)-1:nrow(dat3) dat3 Date Stock1 Stock2 Stock3 1 2000-01-01 NA NA NA 2 2000-01-02 1.609438 1.098612 0.8472979 3 2000-01-03 -1.609438 -1.098612 -0.8472979 4 2000-01-04 1.609438 1.098612 0.8472979 A.K. - Original Message - From: Akhil dua akhil.dua...@gmail.com To: r-help@r-project.org Cc: Sent: Thursday, July 5, 2012 12:58 AM Subject: [R] Return Hello Every one I have data on Stock prices and I want to calculate the return on all the stocks and then replace all the stock prices with the returns can any one tell me how to do My data is in the format given below Date Stock1 Stock2 Stock3 01/01/2000 1 2 3 01/02/2000 5 6 7 01/03/2000 1 2 3 01/04/2000 5 6 7 Thanks [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Maximum Likelihood Estimation Poisson distribution mle {stats4}
Hi everyone! I am using the mle {stats4} to estimate the parameters of distributions by MLE method. I have a problem with the examples they provided with the mle{stats4} html files. Please check the example and my question below! *Here is the mle html help file * http://stat.ethz.ch/R-manual/R-devel/library/stats4/html/mle.html http://stat.ethz.ch/R-manual/R-devel/library/stats4/html/mle.html *In the example provided with the help * od - options(digits = 5) x - 0:10*#generating Poisson counts* y - c(26, 17, 13, 12, 20, 5, 9, 8, 5, 4, 8) *#generating the frequesncies* ## Easy one-dimensional MLE: nLL - function(lambda) -sum(stats::dpois(y, lambda, log=TRUE)) *#they define the Poisson negative loglikelihood* fit0 - mle(nLL, start = list(lambda = 5), nobs = NROW(y)) * #they estimate the Poisson parameter using mle* fit0 *#they call the output* Call: mle(minuslogl = nLL, start = list(lambda = 5), nobs = NROW(y)) Coefficients: lambda 11.545 * #this is their estimated Lambda Vallue.* *Now my question is in a Poisson distribution the Maximum Likelihood estimator of the mean parameter lambda is the sample mean, so if we calculate the sample mean of that generated Poisson distribution manually using R we get the below!* sample.mean- sum(x*y)/sum(y) sample.mean [1] 3.5433 *This is the contradiction!! * Here I am getting the estimate as 3.5433(which is reasonable as most of the values are clustered around 3), but mle code gives the estimate 11.545(which may not be correct as this is out side the range 0:10) Why this contradiction? -- View this message in context: http://r.789695.n4.nabble.com/Maximum-Likelihood-Estimation-Poisson-distribution-mle-stats4-tp4635464.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adding Text to a persp plot using specific coordinates
On 05.07.2012 06:45, Benjamin Dubreuil wrote: Hi folks, I'm experiencing some hassle to add some text on a persp plot... Here is the code to generate the persp plot: x - seq(-1.95, 1.95, length = 30) y - seq(-1.95, 1.95, length = 30) z - outer(x, y, function(a,b) a*b^2) persp(x, y, z,xlim=c(-2,2),ylim=c(-2,2),zlim=c(-8,8), phi=30, theta=-30,nticks=8,ticktype=detailed) I've tried the text() function : text(x=0,y=0,z=0,texts=O) It returns the following error message: Error: evaluation nested too deeply: infinite recursion / options(expressions=)? I've seen that text3d from the rgl library could add text on a plot3d object or a persp object. But when I try this: text3d(x=0,y=0,z=1,texts=O) The result is that R opens a new window with the string O printed in the new window. And no text is added to my previous persp plot. I've seen that maybe using trans3d would help but after many tries, I gave up. This function did not allow me to plot text with the specified coordinates. Yes, trans3d: x - seq(-1.95, 1.95, length = 30) y - seq(-1.95, 1.95, length = 30) z - outer(x, y, function(a,b) a*b^2) pmat - persp(x, y, z,xlim=c(-2,2),ylim=c(-2,2),zlim=c(-8,8), phi=30, theta=-30,nticks=8,ticktype=detailed) text(trans3d(0,0,0,pmat), Hello World!) Best, Uwe Ligges Any ideas on how adding some text or labels to an existing persp plot using the coordinates I've entered? Thanking you in anticipation. Dubreuil Benjamin Research Assistant in Bioinformatics Michnick Lab - Dept of Biochemistry - Université de Montréal [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how to analyse non parametric test
Please i am Anilkumar G.V studieng MBA final sem for this data i want to find out non parametric tests please help me -- Anilkumar G.V Jr,MBA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] stlf algorithm failure
Dear R-helpers, currently I am writing a thesis using R in forecasting scenarios. No the time is shortening and I wanted to set up all test cases, including the STLF algorithm of the forecast package. I am using 15 minutes (96 a day) data of two weeks to forecast one week... The frequency used is 672 (so two weeks) This should be fine because it has to be at least 2 priods.. Unfortunately the stlf algorithm claims that my time series is not periodic or has less than two periods . But both is definitely true... If I add one more value it works but this would falsen the forecast and lead to more problems since I use different granularities and had to add 8 more values in the worst case I have this problem only with the stlf algorithm. Holt Winters and other ones work perfectly... Do you know any solution for this problem? A fast answer would be really great because I am very close to the release of my work for my university Kind regards and thanks in advance Christian Ihr WEB.DE Postfach immer dabei: die kostenlose WEB.DE Mail App für iPhone und Android. [1]https://produkte.web.de/freemail_mobile_startseite/ References 1. https://produkte.web.de/freemail_mobile_startseite/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] MS-DOS script R
I have the following in a batch file TLT.BAT: C:\Program Files\R\R-2.13.2\bin\x64\R.exe CMD BATCH C:\Users\Frank\Documents\R\Scripts\TLT_2012.txt C:\Users\Frank\Documents\R\Scripts\TLT_2012.out PAUSE The TLT_2012.txt has my R code in it. The TLT_2012.out has the output from R. Look to TLT_2012.out for problems with your batch file. Frank Chicago, IL Date: Wed, 4 Jul 2012 05:44:01 -0700 From: cindy.dolom...@insa-lyon.fr To: r-help@r-project.org Subject: [R] MS-DOS script R Hi, I would like to run R with a script from MS-DOS. R is in My Documents and my script too. How to do? -- View this message in context: http://r.789695.n4.nabble.com/MS-DOS-script-R-tp4635398.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Maximum Likelihood Estimation Poisson distribution mle {stats4}
-Original Message- sample.mean- sum(x*y)/sum(y) sample.mean [1] 3.5433 *This is the contradiction!! * Here I am getting the estimate as 3.5433(which is reasonable as most of the values are clustered around 3), but mle code gives the estimate 11.545(which may not be correct as this is out side the range 0:10) Why this contradiction? Ermm.. the sample mean is mean(y) which is 11.545 I deduce that you and the help page have different views on what the sample y represents. i also note that x does not figure at all in the help page's log-likelihood, suggesting that y is a simple set of counts, whereas you have interpreted x as the number of instances of each y. That appears not to be the case. S *** This email and any attachments are confidential. Any use...{{dropped:8}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Maximum Likelihood Estimation Poisson distribution mle {stats4}
On Jul 5, 2012, at 10:48 , chamilka wrote: Hi everyone! I am using the mle {stats4} to estimate the parameters of distributions by MLE method. I have a problem with the examples they provided with the mle{stats4} html files. Please check the example and my question below! *Here is the mle html help file * http://stat.ethz.ch/R-manual/R-devel/library/stats4/html/mle.html http://stat.ethz.ch/R-manual/R-devel/library/stats4/html/mle.html *In the example provided with the help * od - options(digits = 5) x - 0:10*#generating Poisson counts* y - c(26, 17, 13, 12, 20, 5, 9, 8, 5, 4, 8) *#generating the frequesncies* ## Easy one-dimensional MLE: nLL - function(lambda) -sum(stats::dpois(y, lambda, log=TRUE)) *#they define the Poisson negative loglikelihood* fit0 - mle(nLL, start = list(lambda = 5), nobs = NROW(y)) * #they estimate the Poisson parameter using mle* fit0 *#they call the output* Call: mle(minuslogl = nLL, start = list(lambda = 5), nobs = NROW(y)) Coefficients: lambda 11.545 * #this is their estimated Lambda Vallue.* *Now my question is in a Poisson distribution the Maximum Likelihood estimator of the mean parameter lambda is the sample mean, so if we calculate the sample mean of that generated Poisson distribution manually using R we get the below!* sample.mean- sum(x*y)/sum(y) sample.mean [1] 3.5433 *This is the contradiction!! * Here I am getting the estimate as 3.5433(which is reasonable as most of the values are clustered around 3), but mle code gives the estimate 11.545(which may not be correct as this is out side the range 0:10) Why this contradiction? You misunderstand the setup. y are 11 Poisson counts, x is a dose variable used further down in the example. So mean(y) [1] 11.54545 Your sample.mean would be relevant if there were 127 count measurements, 26 of which were 0, 17 were 1, etc. But then the likelihood would be different: x - 0:10 nLL - function(lambda) -sum(y*stats::dpois(x, lambda, log=TRUE)) mle(nLL, start = list(lambda = 5), nobs = NROW(y)) Call: mle(minuslogl = nLL, start = list(lambda = 5), nobs = NROW(y)) Coefficients: lambda 3.54328 -- View this message in context: http://r.789695.n4.nabble.com/Maximum-Likelihood-Estimation-Poisson-distribution-mle-stats4-tp4635464.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard, Professor Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Comparing crossing survival curves
hi isabel, You have to decide if focus is on the survival curves or hazards.. Crossing hazards do not imply crossing survival curves If you are dealing with crossing hazards, and you are interested in testing for an effect of a covariate (presumably with a crossing hazard effect), then a standard Cox model framework based on the martingale representation (start, stop, event) (rather than (time, event)) will suffice Finally, if you are interested in estimating the crossing point you could have a look to the package flexCrossHaz (currently in the Archive.. Let me know if you are interested in becoming maintainer..:-) ) http://cran.r-project.org/src/contrib/Archive/flexCrossHaz/ best, vito Il 05/07/2012 12.37, Isabel Borges ha scritto: Hi I want to compare the survival curves in two groups. Because the hazards are not proportional (the curves cross) the log rank test or Cox proportional hazard test are not suitable. How should such curves be compared? Comands are welcome Thanks in advance __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] 2D image correlation/Packages for 2D CCA?
Hello, I was just wondering if anyone could recommend a package for R that calculates 2D image correlation, e.g. by 2D canonical correspondence analysis? Thanks a lot! MG [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Import single variables from SPSS
Dear all, I have got a very big SPSS dataframe and I would like to just import one or a few variables (the dataframe has got a lot of colums). I used the package foreign but I couldn't find anything in there for my problem. Thank you very much for your help in advance. Marion [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to analyse non parametric test
Please read the posting guidelines. So far we have no idea of what you are doing or how you are doing it. John Kane Kingston ON Canada -Original Message- From: anilgv...@gmail.com Sent: Thu, 5 Jul 2012 15:11:20 +0530 To: r-help@r-project.org Subject: [R] how to analyse non parametric test Please i am Anilkumar G.V studieng MBA final sem for this data i want to find out non parametric tests please help me -- Anilkumar G.V Jr,MBA Share photos screenshots in seconds... TRY FREE IM TOOLPACK at http://www.imtoolpack.com/default.aspx?rc=if1 Works in all emails, instant messengers, blogs, forums and social networks. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] EM algorithm to find MLE of coeff in mixed effects model
I don't believe that R-help permits pdf files. A useful workaround is to post it to a file hosting site like MediaFire and post the link here. John Kane Kingston ON Canada -Original Message- From: jimmycl...@gmail.com Sent: Wed, 4 Jul 2012 22:56:02 -0400 To: pauljoh...@gmail.com Subject: Re: [R] EM algorithm to find MLE of coeff in mixed effects model Dear Paul, Thank you for your suggestion. I was moved by the fact that people are so nice to help learners and ask for nothing. With your help, I made some changes to my code and add more comments, try to make things more clear. If this R email list allow me to upload a pdf file to illustrate the formula, it will be great. The reason I do not use lme4 is that later I plan to use this for a more complicated model (Y,T), Y is the same response of mixed model and T is a drop out process. (Frankly I did it for the more complicated model first and got NaN hessian after one iteration, so I turned to the simple model.) In the code, the m loop is the EM iterations. About efficiency, thank you again for pointing it out. I compared sapply and for loop, they are tie and for loop is even better sometimes. In a paper by Bates, he mentioned that the asymptotic properties of beta is kind of independent of the other two parameters. But it seems that paper was submitted but I did not find it on google scholar. Not sure if it is the reason for my results. That is why I hope some expert who have done some simulation similar may be willing to give some suggestion. I may turn to other methods to calculate the conditional expection of the latent variable as the same time. Modified code is as below: # library(PerformanceAnalytics) # install.packages(gregmisc) # install.packages(statmod) library(gregmisc) library(MASS) library(statmod) ## function to calculate loglikelihood loglike - function(datai = data.list[[i]], vvar = var.old, beta = beta.old, psi = psi.old) { temp1 - -0.5 * log(det(vvar * diag(nrow(datai$Zi)) + datai$Zi %*% psi %*% t(datai$Zi))) temp2 - -0.5 * t(datai$yi - datai$Xi %*% beta) %*% solve(vvar * diag(nrow(datai$Zi)) + datai$Zi %*% psi %*% t(datai$Zi)) %*% (datai$yi - datai$Xi %*% beta) temp1 + temp2 } ## functions to evaluate the conditional expection, saved as Efun v2.R ## Eh1new=E(bibi') ## Eh2new=E(X'(y-Zbi)) ## Eh3new=E(bi'Z'Zbi) ## Eh4new=E(Y-Xibeta-Zibi)'(Y-Xibeta-Zibi) ## Eh5new=E(Xibeta-yi)'Zibi ## Eh6new=E(bi) Eh1new - function(datai = data.list[[i]], weights.m = weights.mat) { bi - datai$b tempb - bi * rep(sqrt(weights.m[, 1] * weights.m[, 2]), 2) #quadratic b, so needsqrt t(tempb) %*% tempb/pi } # end of Eh1 # Eh2new=E(X'(y-Zbi)) Eh2new - function(datai = data.list[[i]], weights.m = weights.mat) { bi - datai$b tempb - bi * rep(weights.m[, 1] * weights.m[, 2], 2) tt - t(datai$Xi) %*% datai$yi - t(datai$Xi) %*% datai$Zi %*% colSums(tempb)/pi c(tt) } # end of Eh2 # Eh3new=E(b'Z'Zbi) Eh3new - function(datai = data.list[[i]], weights.m = weights.mat) { bi - datai$b tempb - bi * rep(sqrt(weights.m[, 1] * weights.m[, 2]), 2) #quadratic b, so need sqrt sum(sapply(as.list(data.frame(datai$Zi %*% t(tempb))), function(s) { sum(s^2) }))/pi } # end of Eh3 # Eh4new=E(Y-Xibeta-Zibi)'(Y-Xibeta-Zibi) Eh4new - function(datai = data.list[[i]], weights.m = weights.mat, beta = beta.old) { bi - datai$b tt - sapply(as.list(ata.frame(c(datai$yi - datai$Xi %*% beta) - datai$Zi %*% t(bi))), function(s) { sum(s^2) }) * (weights.m[, 1] * weights.m[, 2]) sum(tt)/pi } # end of Eh4 Eh4newv2 - function(datai = data.list[[i]], weights.m = weights.mat, beta = beta.old) { bi - datai$b v - weights.m[, 1] * weights.m[, 2] temp - c() for (i in 1:length(v)) { temp[i] - sum(((datai$yi - datai$Xi %*% beta - datai$Zi %*% bi[i, ]) * sqrt(v[i]))^2) } sum(temp)/pi } # end of Eh4 # Eh5new=E(Xibeta-yi)'Zib Eh5new - function(datai = data.list[[i]], weights.m = weights.mat, beta = beta.old) { bi - datai$b tempb - bi * rep(weights.m[, 1] * weights.m[, 2], 2) t(datai$Xi %*% beta - datai$yi) %*% (datai$Zi %*% t(tempb)) sum(t(datai$Xi %*% beta - datai$yi) %*% (datai$Zi %*% t(tempb)))/pi } Eh6new - function(datai = data.list[[i]], weights.m = weights.mat) { bi - datai$b tempw - weights.m[, 1] * weights.m[, 2] for (i in 1:nrow(bi)) { bi[i, ] - bi[i, ] * tempw[i] } colMeans(bi)/pi } # end of Eh6 ## the main R script ### initial the values and generate the data ## n - 100 #number of observations beta - c(-0.5, 1) #true coefficient of fixed effects vvar - 2
Re: [R] Return on Stock Market
if(!require(TTR)){install.packages(TTR); library(TTR)} ? ROC Michael On Wed, Jul 4, 2012 at 11:20 PM, Akhil dua akhil.dua...@gmail.com wrote: Hello Every one I have data on Stock prices and I want to calculate the return on all the stocks and then replace all the stock prices with the returns can any one tell me how to do My data is in the format given below Date Stock1 Stock2 Stock3 01/01/20001 2 3 01/02/20005 6 7 01/03/20001 2 3 01/04/20005 6 7 Thanks [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Create a function automatically from lm formula and coefficients?
Hi, I need to create a function from lm formula and coefficients to use it in my c++ code later. but when I do it as you said require(rms) f - ols( ) g - Function(f) g(x1=2,x2=3,...) I realize that it didn't give me the same result as predict.lm? how can I reach a function to give me the same results in predict? I really need it, please answer me. Regards, Taheri -- View this message in context: http://r.789695.n4.nabble.com/Create-a-function-automatically-from-lm-formula-and-coefficients-tp4433854p4635470.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Hosmer-Lemeshow test for Cox model
Dear list, Usually we use Hosmer-Lemeshow test to test the goodness of fit for logistic model, but if I use it to test for Cox model, how can I get the observed probability for each group? Suppose I calculated the 5-year predicted probability using Cox model, then I split the dataset into 10 group according to this predicted probability. We should compare the observed probability with predicted probability within each group,but how to calculate this observed probability, should I use Kaplan-Meier to estimate it? how should I modify the following program,thanks. hosmerlem = function(y, yhat, g=10) { cutyhat = cut(yhat, breaks = quantile(yhat, probs=seq(0, 1, 1/g)), include.lowest=TRUE) obs = xtabs(cbind(1 - y, y) ~ cutyhat) expect = xtabs(cbind(1 - yhat, yhat) ~ cutyhat) chisq = sum((obs - expect)^2/expect) P = 1 - pchisq(chisq, g - 2) return(list(chisq=chisq,p.value=P)) } -- View this message in context: http://r.789695.n4.nabble.com/Hosmer-Lemeshow-test-for-Cox-model-tp4635482.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] function curve() swap axes
Dear all, I'm using the curve() function to plot discharge Q against water depth a. However, I would like to have a graph of water depth a plotted against discharge Q. How can this be done? Minimal working example: S0 = 0.004 n = 0.04 tanalpha = 1.4/1.5 par(mar = c(5,5,1,1)) # b, l, t, r curve((sqrt(S0)/n)*(0.035+(0.7+(x-0.1)/tanalpha)*(x-0.1))*((0.035+(0.7+(x-0.1)/tanalpha)*(x-0.1))/(2*sqrt((0.7/2)^2+0.1^2)+2*sqrt((x-0.1)^2+((x-0.1)/tanalpha)^2)))^(2/3),0.1,1.55, lwd = 3, col = royalblue4, ann = F, axes = T) title(xlab = parse(text='a~bgroup([, m, ])')) title(ylab = parse(text='Q~bgroup([, m^3/s, ])')) box() I tried to find the inverse function, but that doesn't seem to exist. Thank you for support, Boudewijn [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Maximum Likelihood Estimation Poisson distribution mle {stats4}
Thank you S Ellison-2 for your reply. I will understand it with Prof.Peter Dalgaard's answer.. -- View this message in context: http://r.789695.n4.nabble.com/Maximum-Likelihood-Estimation-Poisson-distribution-mle-stats4-tp4635464p4635484.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to check convergence of arima model
Thanks for your advice. Sajeeka From: Bert Gunter gunter.ber...@gene.com Cc: Rui Barradas ruipbarra...@sapo.pt; r-help@r-project.org r-help@r-project.org Sent: Thursday, July 5, 2012 12:43 AM Subject: Re: [R] how to check convergence of arima model 1. This is a statistical question. Please do not post further to this list. It is about R, and you have summarily dismissed all attempts to answer your R questions as unhelpful. So you need to look elsewhere. 2. Consult a statistician -- you use the statistical words, but do not understand what they mean. There often is NO single model that both maximizes likelihood and minimizes AIC (depending on the models one fits). -- Bert ote: Hi, Thanks for your explanation. But still I didn't get the answer which I need. You have mentioned the code which can obtain AIC values for all models at once. My question is; the procedure of selecting the suitable model to predict using R package. If I select the model which produces the smallest AIC and maximized log likelihood values, as the suitable model is it correct? Sajeeka Nanayakkara From: Rui Barradas ruipbarra...@sapo.pt Cc: r-help@r-project.org Sent: Wednesday, July 4, 2012 3:38 PM Subject: Re: [R] how to check convergence of arima model Hello, Put the fitted models in a list and then use lapply with AIC(). Like this set.seed(1) x - 1:100 + sqrt(1:100 + runif(100)) + rnorm(100) models - list() models[[1]] - arima(diff(x), order = c(1, 0, 0)) # Just to show models[[2]] - arima(diff(x), order = c(1, 0, 1)) # several models[[3]] - arima(diff(x), order = c(2, 0, 2)) # models models[[4]] - arima(diff(x), order = c(2, 0, 3)) lapply(models, AIC) Or run each model at a time through AIC(), whichever suits better. Hope this helps Rui Barradas Em 04-07-2012 10:22, Sajeeka Nanayakkara escreveu: Hi, I need to predict exchange rates using time series. So according to ACF and PACF knowledge, mixed model (ARIMA) be the appropriate and now, I need to find order of the model (p,d,q). So, several models were fitted to select the suitable model using arima(). Could you please tell me the procedure of selecting the correct order as I don't have enough time to search? Thank you. Sajeeka Nanayakkara *From:* Rui Barradas ruipbarra...@sapo.pt *Cc:* r-help@r-project.org *Sent:* Wednesday, July 4, 2012 2:58 PM *Subject:* Re: [R] how to check convergence of arima model Hello, Inline. Em 04-07-2012 09:35, Sajeeka Nanayakkara escreveu: Hi, Sorry, since I didn't see the earlier message I resend it. I read the help page that you mentioned. But the problem is for all models, code is zero. According to that, all models were converged. Considering AIC value the best model is selected. No, arima() does not select models by AIC. That is the default behavior of ar(); arima() does NOT select models, it selects, using optim, values for the parameters of a specified model. You must choose the orders yourself. Hope this helps, Rui Barradas Is that correct procedure? The R code which was used is; model1-arima(rates,c(1,1,1)) model1 model1$code [1] 0 Sajeeka Nanayakkara *From:* Rui Barradas ruipbarra...@sapo.pt mailto:ruipbarra...@sapo.pt *Cc:* r-help@r-project.org mailto:r-help@r-project.org *Sent:* Wednesday, July 4, 2012 2:01 PM *Subject:* Re: [R] how to check convergence of arima model Hello, Sorry, but do you read the answers to your posts? Inline. Em 04-07-2012 08:02, Sajeeka Nanayakkara escreveu: I have already fitted several models using R code; arima(rates,c(p,d,q)) And I have already answered to a question starting like this yesterday. In the mean time, the subject line changed. As I heard, best model produce the smallest AIC value, but maximum likelihood estimation procedure optimizer should converge. How to check whether maximum likelihood estimation procedure optimizer has converged or not? Yes, it was this question, the subject line was 'question'... ... And the answer was: read the manual, that I quoted, by the way. It now changed to: read the manual, period. Rui Barradas [[alternative HTML version deleted]] __ R-help@r-project.org mailto:R-help@r-project.org mailto:R-help@r-project.org mailto:R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]]
[R] i need help in documentation
Hi, i am new in R development , so i need help in documentation from comments in R file for example sample-function(filepath) { ##title read csv file ##author Purushoth ##description this function is used to read a csv file output-read.csv(filepath) return(output) } i need to get html document file using this comments Thanks B.Purushothaman -- View this message in context: http://r.789695.n4.nabble.com/i-need-help-in-documentation-tp4635471.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] vector entry in matix
hi, i'm trying to figure out if there's any possibility to write a whole vector into a matrix or data.frame or something like that. i don't mean transormation. Here an example: [,1] [,2] [1,] a d [2,] b e [3,] c f where e.g. a is a-c(0,1) vector of length 2, b a vector of length 4,... (i know that the matrix enties are strings :)). i'm trying to put some variables into a matrix or data.frame, but i couldn't handle it. is something like that possible in R? thanks for your help! -- View this message in context: http://r.789695.n4.nabble.com/vector-entry-in-matix-tp4635478.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] skipped correlation
hello,Everyone! I am a freshman to use R. Can anybody tell me where has scor function which achieves the skipped correlation? Thank you very much! Bset wishes! Jiaolong Qin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Invalid input in 'utf8towcs' when saving script file
Hello, When I try to save my script file before closing the R console session I get this error. Error: invalid input 'C:\Documents and Settings\\\datafile' in 'utf8towcs' Does anyone know what can cause this error? I use the RGui (R verison 2.14.0) in Windows and the problem appears when I try to re-save the script file. Using Save as and rename it works. Kind regards, Marine Andersson __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] function curve() swap axes
On 05/07/2012 8:34 AM, Boudewijn Verkooijen wrote: Dear all, I'm using the curve() function to plot discharge Q against water depth a. However, I would like to have a graph of water depth a plotted against discharge Q. How can this be done? curve() is designed for plotting y vs x where y is a function of x, so I think you need to do it yourself. Simply compute a vector of a values using seq(), compute the corresponding Q values using your formula, and then use plot(Q, a, type=l) You can add the lwd and other arguments too if you like. Duncan Murdoch Minimal working example: S0 = 0.004 n = 0.04 tanalpha = 1.4/1.5 par(mar = c(5,5,1,1)) # b, l, t, r curve((sqrt(S0)/n)*(0.035+(0.7+(x-0.1)/tanalpha)*(x-0.1))*((0.035+(0.7+(x-0.1)/tanalpha)*(x-0.1))/(2*sqrt((0.7/2)^2+0.1^2)+2*sqrt((x-0.1)^2+((x-0.1)/tanalpha)^2)))^(2/3),0.1,1.55, lwd = 3, col = royalblue4, ann = F, axes = T) title(xlab = parse(text='a~bgroup([, m, ])')) title(ylab = parse(text='Q~bgroup([, m^3/s, ])')) box() I tried to find the inverse function, but that doesn't seem to exist. Thank you for support, Boudewijn [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] empty cell when running GEE for binary data
Hi all, My data are binary and within-subject correlated; there are three factors of interest and two of them are within-subject. So, I have considered modelling the data using a binomial regression with a GEE approach (which can be achieved using the function geeglm). One problem is that I got some empty cells, so I can't simply use logit as a link function. I was wondering if you know any existing R solution for this? also, I tired to use cauchit as the link function, but it turned out that within the function geeglm cauchit is not available. Any idea on this? Many thanks! Yue -- View this message in context: http://r.789695.n4.nabble.com/empty-cell-when-running-GEE-for-binary-data-tp4635483.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Maximum Likelihood Estimation Poisson distribution mle {stats4}
Thank you very much Professor .Peter Dalgaard for your kind explanations.. This made my work easy.. I am struggling with this for more than 2 days and now I got the correct reply. Thank again. -- View this message in context: http://r.789695.n4.nabble.com/Maximum-Likelihood-Estimation-Poisson-distribution-mle-stats4-tp4635464p4635485.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] 7 days confusion over lists
Hello, I am a Masters student and I am working on my thesis modelling smallholder farms using a program in R. I have modified the original code and I am having some issues with lists that I cannot figure out. Originally, I had list file defining lists such as: Param, Crop1, Crop1, Soil, etc. (ex. Param - list() ). Their subsets were listed as Crop1$CContent for example and there was quite a few of them. There is a run file that gets the list file going in the following code: #--Initialising the lists to store variable values tmp - init_lists() Param - tmp[[1]] Crop1 - tmp[[2]] Fert- tmp[[3]] Meteo - tmp[[4]] Soil- tmp[[5]] RainPattern - tmp[[6]] Crop2 - tmp[[7]] Cropres - tmp[[8]] rm(tmp) The problem here is that the lists get confused with each other; that is elements of Soil appear in Crop2. In the run file, I changed the order of the lists (for example Fert - tmp[[5]] and Crop2 -tmp[[3]]) and it changes which ones are confused with what; with some lists not being confused at all. I cannot find why this is a problem as the tmps are removed at the end of the command list shown below and each element is clearly defined as a part of its given list with (list name)$(Sublist maybe)$(specific element name) Either way, I changed the original lists to be more specific, for example: Crop$ - list( CContent - NA, NContent - NA, PContent - NA, KContent - NA, K - list( uptakegivenN, uptakegivenP), dryMatterContent ...etc. The issue then became that it stops at a certain point accepting the assigned values and keeps them as null. That is, the list contains an element that states Crop2$minYieldN - NA but the value becomes NULL when it is run. Also, the list, say Param, contains both: [[1]] [1] NA [[2]] [1] NA [[3]] [[3]][[1]] [1] NA [[3]][[2]] [1] NA and some of the value names (up to a point) $inertCRTR [1] 0.001 $humificationFactor [1] 0.25 $fractionStabilisedSOMC [1] 0.2 $growthEffectMicroorganisms [1] 0.6 Is it perhaps that the list is too long? Please let me know if anything comes to mind, it would be great help. Thank you! Regards, -Igor Milosavljevic Wageningen University and Research Center -- View this message in context: http://r.789695.n4.nabble.com/7-days-confusion-over-lists-tp4635489.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Hosmer-Lemeshow test for Cox model
Any method that requires binning is problematic. Instead, take a look at the calibrate function in the rms package. There is a new option for continuous calibration curves for survival models. Frank jane.wong wrote Dear list, Usually we use Hosmer-Lemeshow test to test the goodness of fit for logistic model, but if I use it to test for Cox model, how can I get the observed probability for each group? Suppose I calculated the 5-year predicted probability using Cox model, then I split the dataset into 10 group according to this predicted probability. We should compare the observed probability with predicted probability within each group,but how to calculate this observed probability, should I use Kaplan-Meier to estimate it? how should I modify the following program,thanks. hosmerlem = function(y, yhat, g=10) { cutyhat = cut(yhat, breaks = quantile(yhat, probs=seq(0, 1, 1/g)), include.lowest=TRUE) obs = xtabs(cbind(1 - y, y) ~ cutyhat) expect = xtabs(cbind(1 - yhat, yhat) ~ cutyhat) chisq = sum((obs - expect)^2/expect) P = 1 - pchisq(chisq, g - 2) return(list(chisq=chisq,p.value=P)) } - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Hosmer-Lemeshow-test-for-Cox-model-tp4635482p4635492.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Invalid input in 'utf8towcs' when saving script file
On 05/07/2012 10:21 AM, Marine Andersson wrote: Hello, When I try to save my script file before closing the R console session I get this error. Error: invalid input 'C:\Documents and Settings\\\datafile' in 'utf8towcs' Does anyone know what can cause this error? I imagine it is converting some non-ascii character. But 2.14.0 isn't current, so it might be a bug that has since been fixed. Duncan Murdoch I use the RGui (R verison 2.14.0) in Windows and the problem appears when I try to re-save the script file. Using Save as and rename it works. Kind regards, Marine Andersson __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] function curve() swap axes
On Jul 5, 2012, at 8:34 AM, Boudewijn Verkooijen wrote: Dear all, I'm using the curve() function to plot discharge Q against water depth a. However, I would like to have a graph of water depth a plotted against discharge Q. How can this be done? Minimal working example: S0 = 0.004 n = 0.04 tanalpha = 1.4/1.5 par(mar = c(5,5,1,1)) # b, l, t, r curve((sqrt(S0)/n)*(0.035+(0.7+(x-0.1)/ tanalpha)*(x-0.1))*((0.035+(0.7+(x-0.1)/tanalpha)*(x-0.1))/ (2*sqrt((0.7/2)^2+0.1^2)+2*sqrt((x-0.1)^2+((x-0.1)/ tanalpha)^2)))^(2/3),0.1,1.55, lwd = 3, col = royalblue4, ann = F, axes = T) title(xlab = parse(text='a~bgroup([, m, ])')) title(ylab = parse(text='Q~bgroup([, m^3/s, ])')) box() I tried to find the inverse function, but that doesn't seem to exist. R does not perform computer algebra. If you wanted a numerical approach, you can construct a close fit to that function with approxfun() and then reverse the x and y roles to create an inverse. Actually, since curve returns a list with x and y components you could also do this: xycurv - curve((sqrt(S0)/n)*(0.035+(0.7+(x-0.1)/ tanalpha)*(x-0.1))*((0.035+(0.7+(x-0.1)/tanalpha)*(x-0.1))/ (2*sqrt((0.7/2)^2+0.1^2)+2*sqrt((x-0.1)^2+((x-0.1)/ tanalpha)^2)))^(2/3),0.1,1.55, + lwd = 3, col = royalblue4, ann = F, axes = T) plot(xycurv$y, xycurv$x) -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Different level set when predicting with e1071's Naive Bayes classifier
Hi! I'm using the Naive Bayes classifier provided by the e1071 package ( http://cran.r-project.org/web/packages/e1071) and I've noticed that the predict function has a different behavior when the level set of the columns used for prediction is different from the ones used for fitting. From inspecting the predict.naiveBayes I came to the conclusion that this is due to the conversion of factors to their internal codes using the data.matrix function. For example, consider the following piece of R code: library(mlbench) library(e1071) data(HouseVotes84) model - naiveBayes(Class ~ ., data = HouseVotes84) head(HouseVotes84) Class V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 1 republicann y nyy y n n n y NAy y y ny 2 republicann y nyy y n n n nny y y n NA 3 democrat NA y y NAy y n n n nyn y y nn 4 democratn y yn NA y n n n nyn y n ny 5 democraty y yny y n n n ny NA y y yy 6 democratn y yny y n n n nnn y y yy predict(model, HouseVotes84[1,-1]) [1] republican Levels: democrat republican new.data - data.frame(V1=n, V2=y, V3=n, V4=y, V5=y, V6=y, V7=n, V8=n, V9=n, V10=y, V11=NA_character_, V12=y, V13=y, V14=y, V15=n, V16=y, stringsAsFactors=TRUE) predict(model, new.data) [1] democrat Levels: democrat republican I haven't used other classification methods in R, so I'm unsure if this is what is expected from the application of the predict function. Is this a bug or the expected behavior? Thanks! -- Joao. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adjusting MaxNwts in MICE Package
Hi Rachel, I am having the same problem! My data set is positing around 4400 weights required. Have you found a solution to this? Does anyone else know how to adjust this default to run MICE successfully? Thanks, Justin -- View this message in context: http://r.789695.n4.nabble.com/Adjusting-MaxNwts-in-MICE-Package-tp3178628p4635495.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] i need help in documentation
R does not, to my knowledge, have anything quite like Python's docstrings (which seems to be what you are talking about) but you might look at Roxygen2 on CRAN which should get you started in this direction. Best, Michael On Thu, Jul 5, 2012 at 7:37 AM, purushothaman purushothama...@ge.com wrote: Hi, i am new in R development , so i need help in documentation from comments in R file for example sample-function(filepath) { ##title read csv file ##author Purushoth ##description this function is used to read a csv file output-read.csv(filepath) return(output) } i need to get html document file using this comments Thanks B.Purushothaman -- View this message in context: http://r.789695.n4.nabble.com/i-need-help-in-documentation-tp4635471.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] vector entry in matix
It is possible to put dimensionality on a list (i.e., a generic vector), which might be what you're looking for. x - list(1:4, letters[1:4], function(x,y) x + y, rnorm(50)) dim(x) - c(2,2) x[[1,2]] x[[2,2]] x[[3,2]] # Error Best, Michael On Thu, Jul 5, 2012 at 8:19 AM, Thomas C. thomas.csa...@gmail.com wrote: hi, i'm trying to figure out if there's any possibility to write a whole vector into a matrix or data.frame or something like that. i don't mean transormation. Here an example: [,1] [,2] [1,] a d [2,] b e [3,] c f where e.g. a is a-c(0,1) vector of length 2, b a vector of length 4,... (i know that the matrix enties are strings :)). i'm trying to put some variables into a matrix or data.frame, but i couldn't handle it. is something like that possible in R? thanks for your help! -- View this message in context: http://r.789695.n4.nabble.com/vector-entry-in-matix-tp4635478.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 7 days confusion over lists
Could you make a reproducible example? [http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example] I can't run any of your code as is. A few comments inline. On Thu, Jul 5, 2012 at 9:44 AM, Wageningen-eR igor.milosavlje...@wur.nl wrote: Hello, I am a Masters student and I am working on my thesis modelling smallholder farms using a program in R. I have modified the original code and I am having some issues with lists that I cannot figure out. Originally, I had list file defining lists such as: Param, Crop1, Crop1, Soil, etc. (ex. Param - list() ). Their subsets were listed as Crop1$CContent for example and there was quite a few of them. There is a run file that gets the list file going in the following code: #--Initialising the lists to store variable values tmp - init_lists() Param - tmp[[1]] Crop1 - tmp[[2]] Fert- tmp[[3]] Meteo - tmp[[4]] Soil- tmp[[5]] RainPattern - tmp[[6]] Crop2 - tmp[[7]] Cropres - tmp[[8]] rm(tmp) The problem here is that the lists get confused with each other; that is elements of Soil appear in Crop2. In the run file, I changed the order of the lists (for example Fert - tmp[[5]] and Crop2 -tmp[[3]]) and it changes which ones are confused with what; with some lists not being confused at all. I cannot find why this is a problem as the tmps are removed at the end of the command list shown below and each element is clearly defined as a part of its given list with (list name)$(Sublist maybe)$(specific element name) Either way, I changed the original lists to be more specific, for example: Crop$ - list( CContent - NA, NContent - NA, PContent - NA, KContent - NA, K - list( uptakegivenN, uptakegivenP), dryMatterContent ...etc. This almost certainly doesn't work. Crop$ - ... should be a syntax error. The issue then became that it stops at a certain point accepting the assigned values and keeps them as null. That is, the list contains an element that states Crop2$minYieldN - NA but the value becomes NULL when it is run. Also, the list, say Param, contains both: [[1]] [1] NA [[2]] [1] NA [[3]] [[3]][[1]] [1] NA [[3]][[2]] [1] NA and some of the value names (up to a point) $inertCRTR [1] 0.001 $humificationFactor [1] 0.25 $fractionStabilisedSOMC [1] 0.2 $growthEffectMicroorganisms [1] 0.6 Is it perhaps that the list is too long? Please let me know if anything comes to mind, it would be great help. No, it's not a length thing. It might be a partial matching thing though. x - list() x$a - 5 x$ab - 10 x$ba - 15 print(x$b) # where's x$b? It's recommended to use the double `[[` in programming applications to avoid this trouble. Best, Michael Thank you! Regards, -Igor Milosavljevic Wageningen University and Research Center -- View this message in context: http://r.789695.n4.nabble.com/7-days-confusion-over-lists-tp4635489.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] skipped correlation
library(sos) findFun(scor) will get you going in the right direction but I'm not sure I see a scor function (I presume you're looking for the robust correlation metric). Do you have a citation for such a function? For a work around, robust::covRob gives robust covariance which possibly could be scaled down to correlation. Best, Michael On Thu, Jul 5, 2012 at 8:49 AM, qinjiaolong qinjiaol...@126.com wrote: hello,Everyone! I am a freshman to use R. Can anybody tell me where has scor function which achieves the skipped correlation? Thank you very much! Bset wishes! Jiaolong Qin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 7 days confusion over lists
I think you need to (re)read the posting guide and An Introduction to R and try posting again. Some food for thought: a) You refer to list files, but this is not a standard term. As the PG says, you need to supply a self-contained example, which would either include internal objects in dput form, or text data in strings read as though from files (e.g. use the text argument). b) You may benefit from using indexing by character strings rather than numeric index to avoid mixups in sequencing. c) I think Crop$ is a typo? d) The claim that it stops accepting the assigned values is more likely to be an error on your part than R being arbitrary. I may be going out on a limb here, but be sure you understand that modifications to data within functions are local, and you have to explicitly return changes if you want them to persist after the function returns. --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. Wageningen-eR igor.milosavlje...@wur.nl wrote: Hello, I am a Masters student and I am working on my thesis modelling smallholder farms using a program in R. I have modified the original code and I am having some issues with lists that I cannot figure out. Originally, I had list file defining lists such as: Param, Crop1, Crop1, Soil, etc. (ex. Param - list() ). Their subsets were listed as Crop1$CContent for example and there was quite a few of them. There is a run file that gets the list file going in the following code: #--Initialising the lists to store variable values tmp - init_lists() Param - tmp[[1]] Crop1 - tmp[[2]] Fert- tmp[[3]] Meteo - tmp[[4]] Soil- tmp[[5]] RainPattern - tmp[[6]] Crop2 - tmp[[7]] Cropres - tmp[[8]] rm(tmp) The problem here is that the lists get confused with each other; that is elements of Soil appear in Crop2. In the run file, I changed the order of the lists (for example Fert - tmp[[5]] and Crop2 -tmp[[3]]) and it changes which ones are confused with what; with some lists not being confused at all. I cannot find why this is a problem as the tmps are removed at the end of the command list shown below and each element is clearly defined as a part of its given list with (list name)$(Sublist maybe)$(specific element name) Either way, I changed the original lists to be more specific, for example: Crop$ - list( CContent - NA, NContent - NA, PContent - NA, KContent - NA, K - list( uptakegivenN, uptakegivenP), dryMatterContent ...etc. The issue then became that it stops at a certain point accepting the assigned values and keeps them as null. That is, the list contains an element that states Crop2$minYieldN - NA but the value becomes NULL when it is run. Also, the list, say Param, contains both: [[1]] [1] NA [[2]] [1] NA [[3]] [[3]][[1]] [1] NA [[3]][[2]] [1] NA and some of the value names (up to a point) $inertCRTR [1] 0.001 $humificationFactor [1] 0.25 $fractionStabilisedSOMC [1] 0.2 $growthEffectMicroorganisms [1] 0.6 Is it perhaps that the list is too long? Please let me know if anything comes to mind, it would be great help. Thank you! Regards, -Igor Milosavljevic Wageningen University and Research Center -- View this message in context: http://r.789695.n4.nabble.com/7-days-confusion-over-lists-tp4635489.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] reshape2 errors on data frame
I've successfully reformatted data frames from long to wide with reshape2, but this time I'm getting errors that I want to understand and resolve. Here's the data frame structure and the results of the melt() and dcast() functions: str(waterchem) 'data.frame': 128412 obs. of 8 variables: $ site: Factor w/ 64 levels D-1,D-2,D-3,..: 1 1 1 1 1 1 1 ... $ sampdate: Date, format: 2007-12-12 2007-12-12 ... $ preeq0 : logi TRUE TRUE TRUE TRUE TRUE TRUE ... $ param : Factor w/ 37 levels Ag,Al,Alk_tot,..: 1 2 8 17 3 4 ... $ quant : num 0.005 0.106 1 231 231 0.011 0.001 0.002 0.001 100 ... $ ceneq1 : logi TRUE FALSE TRUE FALSE FALSE FALSE ... $ floor : num 0 0.106 0 231 231 0.011 0 0 0 100 ... $ ceiling : num 0.005 0.106 1 231 231 0.011 0.001 0.002 0.001 100 ... chem.melt - melt(waterchem, idvars = c('site', 'sampdate', 'preeq0', 'param', 'ceneq1', 'floor', 'ceiling')) Using site, preeq0, param, ceneq1 as id variables chem.cast - dcast(chem.melt, site + sampdate + preeq0 + ceneq1 + floor + ceiling ~ param) Error in eval(expr, envir, enclos) : object 'sampdate' not found Because these data have some censored data there is the logical indicator (ceneq1) to identify censored and uncensored values the the interval ends (floor and ceiling) for multiply censored values. One factor of analytical interest whether the sampdate is pre- or post-event; that uses the logical indicator preeq0. Please point out where my melt() and dcast() syntax was incorrect. I will provide more information if needed. TIA, Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] empty cell when running GEE for binary data
Dear Yu, What do you mean exactly by exmpty cell ? But you can try, the following packages which might help you: gee and yags. But I am not sure. Best Ozgur -- View this message in context: http://r.789695.n4.nabble.com/empty-cell-when-running-GEE-for-binary-data-tp4635483p4635504.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fast trace of inverse
Suppose I have a square matrix A like the toy below set.seed(1234) (A - matrix(rnorm(9), nrow = 3)) I want to trace of the inverse of A. I could do sum(diag(solve(A))) [1] 1.259641 Or I could also do sum(1/eigen(A)$values) [1] 1.259641+0i Now, my actual problem involves a very large, square dense matrix. Is step 2 still the best R-ish way of doing this or is there a smarter and fast implementation for such cases? Harold [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Import single variables from SPSS
Hi Marion, package memisc does what you want, just have a look at ?importers. Hth Eik Am 05.07.2012 15:05, schrieb Marion Wenty: Dear all, I have got a very big SPSS dataframe and I would like to just import one or a few variables (the dataframe has got a lot of colums). I used the package foreign but I couldn't find anything in there for my problem. Thank you very much for your help in advance. Marion [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Eik Vettorazzi Department of Medical Biometry and Epidemiology University Medical Center Hamburg-Eppendorf Martinistr. 52 20246 Hamburg T ++49/40/7410-58243 F ++49/40/7410-57790 -- Pflichtangaben gemäß Gesetz über elektronische Handelsregister und Genossenschaftsregister sowie das Unternehmensregister (EHUG): Universitätsklinikum Hamburg-Eppendorf; Körperschaft des öffentlichen Rechts; Gerichtsstand: Hamburg Vorstandsmitglieder: Prof. Dr. Guido Sauter (Vertreter des Vorsitzenden), Dr. Alexander Kirstein, Joachim Prölß, Prof. Dr. Dr. Uwe Koch-Gromus __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RODBC tables
On Jul 4, 2012, at 10:10 AM, Lorcan Treanor wrote: Dear Sir/Madam, I am desperately in need of some help. I am trying to access tables from the oracle database and inserting them into R via a data frame and I keep getting an error saying that Error in .Call(C_RODBCFetchRows, attr(channel, handle_ptr), max, buffsize, : negative length vectors are not allowed. My connection is fine and my code for this is: check-odbcConnect(dsn=,uid=*,pwd=**) There are terms instead of the *'s but I am not sure if I should disclose them because this is work-related. I have looked all over the internet and tried hundreds of solutions but have had no luck. These are the code I tried to use to get the table i called ANYTHING into R for which the negative length vectors errors came up. sqlTables(check,schema=**) nowfetchmewillwheaton-sqlFetch(check,ANYTHING) Do the errors imply that my connection is wrong, my code is wrong or that the table is not in the correct place in oracle. Please let me know if you can help or if you can give me the email address of someone who can. Kind Regards, Lorcan Treanor You don't need to post the content of the *s. That is not needed, so you do not need to compromise your security. This issue has come up before and I don't know, other than in one case years ago, that a definitive solution was ever posted. It seemed to be otherwise associated with 64 bit installations and there were some hints of integer overflow or an ODBC driver stability issue, but I don't recall that being confirmed. In the one case where Prof. Ripley had posted a work-around, it was back in 2002 and involved querying tables with an underscore ('_') in the table name. I don't know that this issue is still relevant as I use RODBC on OSX (Fedora Linux for years previously) to access Oracle tables and views and they do have underscores in the names. Some comments: 1. You are better off subscribing to and posting this query to the r-sig-db list which is focused on R and databases. More info here: https://stat.ethz.ch/mailman/listinfo/r-sig-db Be sure to include information about your OS, R version including 32 or 64 bit. See the R Posting Guide for additional information on what default info to include in a post. 2. Any chance that you do not have appropriate permission to access the ANYTHING table? You may want to verify that with your DBAdmin. Is that actually the name of the table? 3. Try using: nowfetchmewilwheaton - sqlQuery(check, select * from ANYTHING) and see if that works. Regards, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reshape2 errors on data frame
It would be very helpful to have some sample data to play with. str() shows the structure of the data set but it is not the data. Something like dput(head(100)) would probably be enough. John Kane Kingston ON Canada -Original Message- From: rshep...@appl-ecosys.com Sent: Thu, 5 Jul 2012 08:46:31 -0700 (PDT) To: r-help@r-project.org Subject: [R] reshape2 errors on data frame I've successfully reformatted data frames from long to wide with reshape2, but this time I'm getting errors that I want to understand and resolve. Here's the data frame structure and the results of the melt() and dcast() functions: str(waterchem) 'data.frame': 128412 obs. of 8 variables: $ site: Factor w/ 64 levels D-1,D-2,D-3,..: 1 1 1 1 1 1 1 ... $ sampdate: Date, format: 2007-12-12 2007-12-12 ... $ preeq0 : logi TRUE TRUE TRUE TRUE TRUE TRUE ... $ param : Factor w/ 37 levels Ag,Al,Alk_tot,..: 1 2 8 17 3 4 ... $ quant : num 0.005 0.106 1 231 231 0.011 0.001 0.002 0.001 100 ... $ ceneq1 : logi TRUE FALSE TRUE FALSE FALSE FALSE ... $ floor : num 0 0.106 0 231 231 0.011 0 0 0 100 ... $ ceiling : num 0.005 0.106 1 231 231 0.011 0.001 0.002 0.001 100 ... chem.melt - melt(waterchem, idvars = c('site', 'sampdate', 'preeq0', 'param', 'ceneq1', 'floor', 'ceiling')) Using site, preeq0, param, ceneq1 as id variables chem.cast - dcast(chem.melt, site + sampdate + preeq0 + ceneq1 + floor + ceiling ~ param) Error in eval(expr, envir, enclos) : object 'sampdate' not found Because these data have some censored data there is the logical indicator (ceneq1) to identify censored and uncensored values the the interval ends (floor and ceiling) for multiply censored values. One factor of analytical interest whether the sampdate is pre- or post-event; that uses the logical indicator preeq0. Please point out where my melt() and dcast() syntax was incorrect. I will provide more information if needed. TIA, Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. FREE 3D MARINE AQUARIUM SCREENSAVER - Watch dolphins, sharks orcas on your desktop! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] vector entry in matix
On Jul 5, 2012, at 11:22 AM, R. Michael Weylandt wrote: It is possible to put dimensionality on a list (i.e., a generic vector), which might be what you're looking for. x - list(1:4, letters[1:4], function(x,y) x + y, rnorm(50)) dim(x) - c(2,2) x[[1,2]] x[[2,2]] x[[3,2]] # Error That constructs a 2 x 2 matrix whose elements are lists: x [,1][,2] [1,] Integer,4 ? [2,] Character,4 Numeric,50 is.matrix(x) [1] TRUE I suppose one could take that approach to constructing an Excel-mimic, since the functional element survives that passage: x[1,2][[1]](3,4) [1] 7 Best, Michael On Thu, Jul 5, 2012 at 8:19 AM, Thomas C. thomas.csa...@gmail.com wrote: hi, i'm trying to figure out if there's any possibility to write a whole vector into a matrix or data.frame or something like that. i don't mean transormation. Here an example: [,1] [,2] [1,] a d [2,] b e [3,] c f where e.g. a is a-c(0,1) vector of length 2, b a vector of length 4,... (i know that the matrix enties are strings :)). i'm trying to put some variables into a matrix or data.frame, but i couldn't handle it. is something like that possible in R? The matrix classes objects loose their names, however. You will not have a,b, ... as handles in that matrix. You could create row and column names. thanks for your help! David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] vector entry in matix
well thank you, i think we get closer to my problem. but i meant it in a different way. maybe i describe what i intended to do: i have number of triangles which i'd like to store in a list, matrix,...etc. i thought it could look sth. like that: trianglenode1node2node3 . 1 c(x1,y1) c(x2,y3) c(x3,y3) 1 c(x4,y4) c(x5,y5) c(x6,y6) . . . i know that i could just make more rows for the x/y values but the problem is, that i need to store more than these properties (say the neighbours of these nodes and their coordinates) and i thought it would be more convenient to save these as vectors. it was just a thought... is this possible? -- View this message in context: http://r.789695.n4.nabble.com/vector-entry-in-matix-tp4635478p4635509.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] vector entry in matix
Hi, I think it is better to store vectors in to list when the vectors are of different lengths. #Consider these cases, #vectors of equal length set.seed(1) list1-list(a=rnorm(5,15),b=rnorm(5,25),c=1:5,d=runif(5,0.4),e=16:20,f=rep(1,5)) mat1-cbind(rbind(list1[[1]],list1[[2]],list1[[3]]),(rbind(list1[[4]],list1[[5]],list1[[6]]))) mat1 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [1,] 16.43302 16.98040 14.63278 13.95587 15.56972 0.5371949 0.7574272 [2,] 24.86495 27.40162 24.96076 25.68974 25.02800 16.000 17.000 [3,] 1.0 2.0 3.0 4.0 5.0 1.000 1.000 [,8] [,9] [,10] [1,] 0.7449233 0.4462386 0.4213243 [2,] 18.000 19.000 20.000 [3,] 1.000 1.000 1.000 dat1-data.frame(list1) dat1 a b c d e f 1 16.43302 24.86495 1 0.5371949 16 1 2 16.98040 27.40162 2 0.7574272 17 1 3 14.63278 24.96076 3 0.7449233 18 1 4 13.95587 25.68974 4 0.4462386 19 1 5 15.56972 25.02800 5 0.4213243 20 1 #vectors of uneual length #works with list list2-list(a=rnorm(5,15),b=rnorm(10,25),c=5:15,d=runif(3,0.4),e=1:5,f=rep(1:3,3)) list2 $a [1] 15.36594 15.24841 15.06529 15.01916 15.25734 $b [1] 24.35099 24.88083 25.66414 26.10097 25.14377 24.88225 24.08793 23.56241 [9] 24.20291 26.25408 $c [1] 5 6 7 8 9 10 11 12 13 14 15 $d [1] 0.8679909 0.9283714 0.6478745 $e [1] 1 2 3 4 5 $f [1] 1 2 3 1 2 3 1 2 3 #Warning or Errors in matrix and dataframe mat2-cbind(rbind(list2[[1]],list2[[2]],list2[[3]]),(rbind(list2[[4]],list2[[5]],list2[[6]]))) Warning messages: 1: In rbind(list2[[1]], list2[[2]], list2[[3]]) : number of columns of result is not a multiple of vector length (arg 1) 2: In rbind(list2[[4]], list2[[5]], list2[[6]]) : number of columns of result is not a multiple of vector length (arg 2) dat2-data.frame(list2) Error in data.frame(a = c(15.3659411230492, 15.2484126488726, 15.0652881816716, : arguments imply differing number of rows: 5, 10, 11, 3, 9 #So, it would be better to store it as a list. #But, you can still convert vectors of unequal lengths to data frame or matrix with NAs. library(zoo) a-zoo(1:8,1:14) b-zoo(5:8,3:5) c-zoo(1:5,2:5) list3-list(a,b,c) z-zoo() for(i in 1:length(list3)) z-merge(z,get(list3[[i]])) names(z)-unlist(list3) z a b c 1 1 NA NA 2 2 NA 1 3 3 5 2 4 4 6 3 5 5 7 4 6 6 NA NA 7 7 NA NA 8 8 NA NA 9 1 NA NA 10 2 NA NA 11 3 NA NA 12 4 NA NA 13 5 NA NA 14 6 NA NA z1-data.frame(z) is.data.frame(z1) [1] TRUE str(z1) 'data.frame': 14 obs. of 3 variables: $ a: int 1 2 3 4 5 6 7 8 1 2 ... $ b: int NA NA 5 6 7 NA NA NA NA NA ... $ c: int NA 1 2 3 4 NA NA NA NA NA ... z2-as.matrix(z) A.K. - Original Message - From: Thomas C. thomas.csa...@gmail.com To: r-help@r-project.org Cc: Sent: Thursday, July 5, 2012 9:19 AM Subject: [R] vector entry in matix hi, i'm trying to figure out if there's any possibility to write a whole vector into a matrix or data.frame or something like that. i don't mean transormation. Here an example: [,1] [,2] [1,] a d [2,] b e [3,] c f where e.g. a is a-c(0,1) vector of length 2, b a vector of length 4,... (i know that the matrix enties are strings :)). i'm trying to put some variables into a matrix or data.frame, but i couldn't handle it. is something like that possible in R? thanks for your help! -- View this message in context: http://r.789695.n4.nabble.com/vector-entry-in-matix-tp4635478.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] i need help in documentation
See also the inlinedocs package, which might be closer to what you're looking for. -- Bert On Thu, Jul 5, 2012 at 8:17 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote: R does not, to my knowledge, have anything quite like Python's docstrings (which seems to be what you are talking about) but you might look at Roxygen2 on CRAN which should get you started in this direction. Best, Michael On Thu, Jul 5, 2012 at 7:37 AM, purushothaman purushothama...@ge.com wrote: Hi, i am new in R development , so i need help in documentation from comments in R file for example sample-function(filepath) { ##title read csv file ##author Purushoth ##description this function is used to read a csv file output-read.csv(filepath) return(output) } i need to get html document file using this comments Thanks B.Purushothaman -- View this message in context: http://r.789695.n4.nabble.com/i-need-help-in-documentation-tp4635471.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Exclude missing values on only 1 variable
Hello, I have many hundred variables in my longitudinal dataset and lots of missings. In order to plot data I need to remove missings. If I do data - na.omit(data) that will reduce my dataset to 2% of its original size ;) So I only need to listwise delete missings on 3 variables (the ones I am plotting). data$variable1 -na.omit(data$variable1) does not work. Thank you [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NADA Data Frame Format: Wide or Long?
Rich, I am not familiar with the NADA package, but the reference manual ( http://cran.r-project.org/web/packages/NADA/NADA.pdf) has many examples using several data sets included in the package. Look up one of the functions that you plan to use, run the example in R, and look at the data that is used in the example to see how it is organized. Jean Rich Shepard rshep...@appl-ecosys.com wrote on 07/03/2012 11:57:30 AM: I have water chemistry data with censored values (i.e., those less than reporting levels) in a data frame with a narrow (i.e., database table) format. The structure is: $ site: Factor w/ 64 levels D-1,D-2,D-3,..: 1 1 1 1 1 1 1 1 ... $ sampdate: Date, format: 2007-12-12 2007-12-12 ... $ preeq0 : logi TRUE TRUE TRUE TRUE TRUE TRUE ... $ param : Factor w/ 37 levels Ag,Al,Alk_tot,..: 1 2 8 17 3 4 9 ... $ quant : num 0.005 0.106 1 231 231 0.011 0.001 0.002 0.001 100 ... $ ceneq1 : logi TRUE FALSE TRUE FALSE FALSE FALSE ... $ floor : num 0 0.106 0 231 231 0.011 0 0 0 100 ... $ ceiling : num 0.005 0.106 1 231 231 0.011 0.001 0.002 0.001 100 ... The logical 'preeq0' separates sampdate into two groups; 'ceneq1' indicates censored/uncensored values; 'floor' and 'ceiling' are the minima and maxima for censored values. The NADA package methods will be used, but I have not found information on whether this format or the wide (i.e., spreadsheet) format should be used. The NADA.pdf document doesn't tell me; at least, I haven't found the answer there. I can apply reshape2 to melt and re-cast the data in wide format if that's what is appropriate. Please provide a pointer to documents I can read for an answer to this and related questions. Rich [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] empty cell when running GEE for binary data
hi Ozgur, Thanks for your reply! by empty cell I mean there was a condition in which all participants showed the same response (let's say response A). In other words, the probability of A, according to the present data, was 100%; this led to the empty cell for non-A. The odds or p(A) would be 1/0, so it's not appropriate to use logit as the link function. I don't think, but not sure, gee would work, but i will try the other package you suggested! Thanks! Yue -- View this message in context: http://r.789695.n4.nabble.com/empty-cell-when-running-GEE-for-binary-data-tp4635483p4635513.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] vector entry in matix
You have failed to provide a complete, coherent description of what you wish to do. In the absence of such a description, all suggestions are just guesses. You need to think carefully about what information you want to associate with each triangle and the appropriate data structure to use to do this. You might do well to talk with a computer scientist about this. An object-oriented framework seems like the natural approach. This also sounds like the sort of thing Bioconductor might already have, so search around there, perhaps posting on their Help list. -- Bert On Thu, Jul 5, 2012 at 9:37 AM, Thomas C. thomas.csa...@gmail.com wrote: well thank you, i think we get closer to my problem. but i meant it in a different way. maybe i describe what i intended to do: i have number of triangles which i'd like to store in a list, matrix,...etc. i thought it could look sth. like that: trianglenode1node2node3 . 1 c(x1,y1) c(x2,y3) c(x3,y3) 1 c(x4,y4) c(x5,y5) c(x6,y6) . . . i know that i could just make more rows for the x/y values but the problem is, that i need to store more than these properties (say the neighbours of these nodes and their coordinates) and i thought it would be more convenient to save these as vectors. it was just a thought... is this possible? -- View this message in context: http://r.789695.n4.nabble.com/vector-entry-in-matix-tp4635478p4635509.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Exclude missing values on only 1 variable
On Jul 5, 2012, at 1:25 PM, Eiko Fried wrote: Hello, I have many hundred variables in my longitudinal dataset and lots of missings. In order to plot data I need to remove missings. If I do data - na.omit(data) that will reduce my dataset to 2% of its original size ;) So I only need to listwise delete missings on 3 variables (the ones I am plotting). data$variable1 -na.omit(data$variable1) ?complete.cases # returns a logical vector data[ complete.cases( data[ , c(var1, var2, var3]) , ] David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reshape2 errors on data frame
On Thu, 5 Jul 2012, John Kane wrote: It would be very helpful to have some sample data to play with. John, dput() doesn't want to cooperate so here are 100 rows from the raw data file that can be input with read.table(sample.dat, header = T, sep = |): 'site'|'sampdate'|'preeq0'|'param'|'quant'|'ceneq1'|'floor'|'ceiling' 'A'|'2007-12-12'|1|'Ag'|0.005|1|0|0.005 'A'|'2007-12-12'|1|'Al'|0.106|0|0.106|0.106 'A'|'2007-12-12'|1|'CO3'|1|1|0|1 'A'|'2007-12-12'|1|'HCO3'|231.000|0|231.000|231.000 'A'|'2007-12-12'|1|'Alk_tot'|231.000|0|231.000|231.000 'A'|'2007-12-12'|1|'As'|0.011|0|0.011|0.011 'A'|'2007-12-12'|1|'Ba'|0.001|1|0|0.001 'A'|'2007-12-12'|1|'Be'|0.002|1|0|0.002 'A'|'2007-12-12'|1|'Bi'|0.001|1|0|0.001 'A'|'2007-12-12'|1|'Ca'|100.000|0|100.000|100.000 'A'|'2007-12-12'|1|'Cd'|0.001|1|0|0.001 'A'|'2007-12-12'|1|'Cl'|1.430|0|1.430|1.430 'A'|'2007-12-12'|1|'Co'|0.001|1|0|0.001 'A'|'2007-12-12'|1|'Cr'|0.006|1|0|0.006 'A'|'2007-12-12'|1|'Cu'|0.024|0|0.024|0.024 'A'|'2007-12-12'|1|'DO'|4.960|0|4.960|4.960 'A'|'2007-12-12'|1|'Fe'|4.110|0|4.110|4.110 'A'|'2007-12-12'|1|'K'|0.001|1|0|0.001 'A'|'2007-12-12'|1|'Mg'|9.560|0|9.560|9.560 'A'|'2007-12-12'|1|'Mn'|0.035|0|0.035|0.035 'A'|'2007-12-12'|1|'Mo'|0.001|1|0|0.001 'A'|'2007-12-12'|1|'Na'|0.970|0|0.970|0.970 'A'|'2007-12-12'|1|'NH4'|0.01|1|0|0.01 'A'|'2007-12-12'|1|'NO3-NO2'|0.293|0|0.293|0.293 'A'|'2007-12-12'|1|'Pb'|0.025|0|0.025|0.025 'A'|'2007-12-12'|1|'pH'|7.800|0|7.800|7.800 'A'|'2007-12-12'|1|'Sb'|0.001|1|0|0.001 'A'|'2007-12-12'|1|'SC'|630.000|0|630.000|630.000 'A'|'2007-12-12'|1|'Se'|0.001|0|0.001|0.001 'A'|'2007-12-12'|1|'SO4'|65.800|0|65.800|65.800 'A'|'2007-12-12'|1|'Sr'|0.001|1|0|0.001 'A'|'2007-12-12'|1|'TDS'|320.000|0|320.000|320.000 'A'|'2007-12-12'|1|'Tl'|0.001|1|0|0.001 'A'|'2007-12-12'|1|'V'|0.001|1|0|0.001 'A'|'2007-12-12'|1|'Zn'|11.400|0|11.400|11.400 'A'|'2008-03-15'|1|'Ag'|0.005|1|0|0.005 'A'|'2008-03-15'|1|'Al'|0.08|1|0|0.08 'A'|'2008-03-15'|1|'CO3'|1|1|0|1 'A'|'2008-03-15'|1|'HCO3'|228.000|0|228.000|228.000 'A'|'2008-03-15'|1|'Alk_tot'|228.000|0|228.000|228.000 'A'|'2008-03-15'|1|'As'|0.001|0|0.001|0.001 'A'|'2008-03-15'|1|'Ba'|0.001|1|0|0.001 'A'|'2008-03-15'|1|'Be'|0.002|1|0|0.002 'A'|'2008-03-15'|1|'Bi'|0.001|1|0|0.001 'A'|'2008-03-15'|1|'Ca'|88.400|0|88.400|88.400 'A'|'2008-03-15'|1|'Cd'|0.001|1|0|0.001 'A'|'2008-03-15'|1|'Cl'|1.340|0|1.340|1.340 'A'|'2008-03-15'|1|'Co'|0.001|1|0|0.001 'A'|'2008-03-15'|1|'Cr'|0.006|1|0|0.006 'A'|'2008-03-15'|1|'Cu'|0.014|0|0.014|0.014 'A'|'2008-03-15'|1|'DO'|9.910|0|9.910|9.910 'A'|'2008-03-15'|1|'Fe'|0.309|0|0.309|0.309 'A'|'2008-03-15'|1|'Hg'|0.001|1|0|0.001 'A'|'2008-03-15'|1|'K'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'Ag'|0.005|1|0|0.005 'B-2'|'1992-02-29'|0|'Al'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'As'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'Ba'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'Be'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'Bi'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'Ca'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'Cd'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'Cl'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'Co'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'Cr'|0.03|1|0|0.03 'B-2'|'1992-02-29'|0|'Cu'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'Fe'|1.2|1|0|1.2 'B-2'|'1992-02-29'|0|'Hg'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'K'|19.800|0|19.800|19.800 'B-2'|'1992-02-29'|0|'Mg'|1|1|0|1 'B-2'|'1992-02-29'|0|'Mn'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'Mo'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'Na'|0.03|1|0|0.03 'B-2'|'1992-02-29'|0|'NH4'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'Pb'|0.025|0|0.025|0.025 'B-2'|'1992-02-29'|0|'pH'|8.480|0|8.480|8.480 'B-2'|'1992-02-29'|0|'Sb'|0.01|1|0|0.01 'B-2'|'1992-02-29'|0|'Se'|0.003|1|0|0.003 'B-2'|'1992-02-29'|0|'Sr'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'TDS'|15|1|0|15 'B-2'|'1992-02-29'|0|'Tl'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'V'|0.001|1|0|0.001 'B-2'|'1992-02-29'|0|'Zn'|0.001|1|0|0.001 'B-2'|'1992-04-02'|0|'Ag'|0.005|1|0|0.005 'B-2'|'1992-04-02'|0|'Al'|0.001|1|0|0.001 'B-2'|'1992-04-02'|0|'As'|0.001|1|0|0.001 'B-2'|'1992-04-02'|0|'Ba'|0.001|1|0|0.001 'B-2'|'1992-04-02'|0|'Be'|0.001|1|0|0.001 'B-2'|'1992-04-02'|0|'Bi'|0.001|1|0|0.001 'B-2'|'1992-04-02'|0|'Ca'|0.001|1|0|0.001 'B-2'|'1992-04-02'|0|'Cd'|0.001|1|0|0.001 'B-2'|'1992-04-02'|0|'Cl'|0.001|1|0|0.001 'B-2'|'1992-04-02'|0|'Co'|0.001|1|0|0.001 'B-2'|'1992-04-02'|0|'Cr'|0.03|1|0|0.03 'B-2'|'1992-04-02'|0|'Cu'|0.001|1|0|0.001 'B-2'|'1992-04-02'|0|'Fe'|1.2|1|0|1.2 'B-2'|'1992-04-02'|0|'Hg'|0.001|1|0|0.001 'B-2'|'1992-04-02'|0|'K'|0.001|1|0|0.001 'B-2'|'1992-04-02'|0|'Mg'|1|1|0|1 'B-2'|'1992-04-02'|0|'Mn'|0.001|1|0|0.001 Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] a problem about WLS
thanks a lot, it works On Thu, Jul 5, 2012 at 11:40 AM, Thomas Lumley-2 [via R] ml-node+s789695n4635453...@n4.nabble.com wrote: On Thu, Jul 5, 2012 at 11:40 AM, jacquesliu [hidden email]http://user/SendEmail.jtp?type=nodenode=4635453i=0 wrote: I was asked to do a WLS estimation, so I thought of lm() with weights like wls=lm(Y~X-1,weight=INC) however, it gives different result as below code, which use the formula of WLS y-Y*INC^-0.5 x-X*INC^-0.5 wls=lm(y~x-1) Can anybody explain to me why the first code can not give the right answer? The two examples are using the opposite weights. To replicate the answer from the second approach with lm, use weight=INC^-1 In lm(), observations with high weights have more, um, weight. They influence the results more than observations with low weight. Your code does the opposite. -thomas -- Thomas Lumley Professor of Biostatistics University of Auckland __ [hidden email] http://user/SendEmail.jtp?type=nodenode=4635453i=1mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- If you reply to this email, your message will be added to the discussion below: http://r.789695.n4.nabble.com/a-problem-about-WLS-tp4635446p4635453.html To unsubscribe from a problem about WLS, click herehttp://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_codenode=4635446code=amFjcXVlc2xpdUBnbWFpbC5jb218NDYzNTQ0NnwxNTkzNzg0MjQx . NAMLhttp://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewerid=instant_html%21nabble%3Aemail.namlbase=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespacebreadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml -- Liu,Jacques -- View this message in context: http://r.789695.n4.nabble.com/a-problem-about-WLS-tp4635446p4635523.html Sent from the R help mailing list archive at Nabble.com. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reshape2 errors on data frame
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Rich Shepard Sent: Thursday, July 05, 2012 10:36 AM To: r-help@r-project.org Subject: Re: [R] reshape2 errors on data frame On Thu, 5 Jul 2012, John Kane wrote: It would be very helpful to have some sample data to play with. John, dput() doesn't want to cooperate so here are 100 rows from the raw data file that can be input with read.table(sample.dat, header = T, sep = |): 'site'|'sampdate'|'preeq0'|'param'|'quant'|'ceneq1'|'floor'|'ceiling' 'A'|'2007-12-12'|1|'Ag'|0.005|1|0|0.005 snip 'B-2'|'1992-04-02'|0|'Mn'|0.001|1|0|0.001 Rich Rich, The melt syntax you provided in a previous email runs without error on the data that you just supplied. So, your comment that dput() doesn't want to cooperate suggests that maybe there is a problem with your waterchem data frame. Maybe someone else will have more insight into the problem. Sorry I can't be of more help, Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Removing rows if certain elements are found in character string
Perhaps I've missed something, but if it's really true that the goal is to remove rows if the first non-zero element is D or d, then how about this: tmp - gsub('0','',df$ch) first - substr(tmp,1,1) subset(df, tolower(first) != 'd') and of course it could be rolled up into a single expression, but I wrote it in several steps to make it easy to follow. No need to wrap one's brain around regular expressions (which is hard for me!) -Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 On 7/2/12 3:48 PM, Claudia Penaloza claudiapenal...@gmail.com wrote: I would like to remove rows from the following data frame (df) if there are only two specific elements found in the df$ch character string (I want to remove rows with only 0 D or 0 d). Alternatively, I would like to remove rows if the first non-zero element is D or d. ch count 1 00D0 0.007368; 2 00d0 0.002456; 3 0T00 0.007368; 4 0TD0 0.007368; 5 0T00 0.002456; 6 0Td0 0.002456; 7 T000 0.007368; 8 T0D0 0.007368; 9 T000 0.002456; 10 T0d0 0.002456; I tried the following but it doesn't work if there is more than one character per string: df - df[!df$ch %in% c(0,D),] df - df[!df$ch %in% c(0,d),] Any help greatly appreciated, Claudia [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Exclude missing values on only 1 variable
Be careful!! The plots could be potentially misleading. The problem is the nature of the missingness. The approach you are taking is based on assuming MCAR missingness (look it up, if necessary). If that is not the case -- e.g. if there is censoring, MAR, or informative missingness -- the plots may be completely misleading. Missingness in longitudinal data is a very difficult issue. If this is something you don't know about already, I strongly suggest that you consult a statistician who does -- not all of us do (I know almost nothing, for example). -- Bert On Thu, Jul 5, 2012 at 10:34 AM, David Winsemius dwinsem...@comcast.netwrote: On Jul 5, 2012, at 1:25 PM, Eiko Fried wrote: Hello, I have many hundred variables in my longitudinal dataset and lots of missings. In order to plot data I need to remove missings. If I do data - na.omit(data) that will reduce my dataset to 2% of its original size ;) So I only need to listwise delete missings on 3 variables (the ones I am plotting). data$variable1 -na.omit(data$variable1) ?complete.cases # returns a logical vector data[ complete.cases( data[ , c(var1, var2, var3]) , ] David Winsemius, MD West Hartford, CT __** R-help@r-project.org mailing list https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/** posting-guide.html http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reshape2 errors on data frame
On Thu, 5 Jul 2012, Nordlund, Dan (DSHS/RDA) wrote: The melt syntax you provided in a previous email runs without error on the data that you just supplied. So, your comment that dput() doesn't want to cooperate suggests that maybe there is a problem with your waterchem data frame. Maybe someone else will have more insight into the problem. Dan, Huh! Isn't that interesting? I'll re-read the data into a data frame and see if that makes a difference. I wonder if the issue is with the logical columns; I've not used them before, only dates, factors, and numerals. I'll try without the logicals and see if there's a difference. Thanks for the insight, Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reshape2 errors on data frame
On Thu, 5 Jul 2012, Rich Shepard wrote: I wonder if the issue is with the logical columns; I've not used them before, only dates, factors, and numerals. I'll try without the logicals and see if there's a difference. Nope. Same problem as before. Someone, please provide a process I can apply to figure out the error in the data frame. Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reshape2 errors on data frame
On Jul 5, 2012, at 1:28 PM, Rich Shepard rshep...@appl-ecosys.com wrote: On Thu, 5 Jul 2012, Rich Shepard wrote: I wonder if the issue is with the logical columns; I've not used them before, only dates, factors, and numerals. I'll try without the logicals and see if there's a difference. Nope. Same problem as before. Someone, please provide a process I can apply to figure out the error in the data frame. What do you mean it won't cooperate? Error message? Michael Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reshape2 errors on data frame
Hello, Just to give it a try, you've written 'idvars' when it's 'id.vars'. After correction I no longer have the error. But, as expected, the result 'chem.cast' has lots and lots of NAs. chem.cast - dcast(chem.melt, site + sampdate + preeq0 + ceneq1 + floor + ceiling ~ param) # First 6 rows head(chem.cast) site sampdate preeq0 ceneq1 floor ceiling AgAl Alk_totAs Ba Be Bi Ca Cd 1A 2007-12-12 TRUE FALSE 0.001 0.001 NANA NANA NA NA NA NA NA 2A 2007-12-12 TRUE FALSE 0.011 0.011 NANA NA 0.011 NA NA NA NA NA 3A 2007-12-12 TRUE FALSE 0.024 0.024 NANA NANA NA NA NA NA NA 4A 2007-12-12 TRUE FALSE 0.025 0.025 NANA NANA NA NA NA NA NA 5A 2007-12-12 TRUE FALSE 0.035 0.035 NANA NANA NA NA NA NA NA 6A 2007-12-12 TRUE FALSE 0.106 0.106 NA 0.106 NANA NA NA NA NA NA Cl Co CO3 CrCu DO Fe HCO3 Hg K MgMn Mo Na NH4 NO3-NO2Pb pH Sb SCSe 1 NA NA NA NANA NA NA NA NA NA NANA NA NA NA NANA NA NA NA 0.001 2 NA NA NA NANA NA NA NA NA NA NANA NA NA NA NANA NA NA NANA 3 NA NA NA NA 0.024 NA NA NA NA NA NANA NA NA NA NANA NA NA NANA 4 NA NA NA NANA NA NA NA NA NA NANA NA NA NA NA 0.025 NA NA NANA 5 NA NA NA NANA NA NA NA NA NA NA 0.035 NA NA NA NANA NA NA NANA 6 NA NA NA NANA NA NA NA NA NA NANA NA NA NA NANA NA NA NANA SO4 Sr TDS Tl V Zn 1 NA NA NA NA NA NA 2 NA NA NA NA NA NA 3 NA NA NA NA NA NA 4 NA NA NA NA NA NA 5 NA NA NA NA NA NA 6 NA NA NA NA NA NA # Now the last 6 rows tail(chem.cast) site sampdate preeq0 ceneq1 floor ceilingAgAl Alk_tot AsBaBe 49 B-2 1992-02-29 FALSE TRUE 0 15.000NANA NANA NANA 50 B-2 1992-04-02 FALSE TRUE 0 0.001NA 0.001 NA 0.001 0.001 0.001 51 B-2 1992-04-02 FALSE TRUE 0 0.005 0.005NA NANA NANA 52 B-2 1992-04-02 FALSE TRUE 0 0.030NANA NANA NANA 53 B-2 1992-04-02 FALSE TRUE 0 1.000NANA NANA NANA 54 B-2 1992-04-02 FALSE TRUE 0 1.200NANA NANA NANA BiCaCdClCo CO3 CrCu DO Fe HCO3Hg K MgMn Mo 49NANANANANA NA NANA NA NA NANANA NANA NA 50 0.001 0.001 0.001 0.001 0.001 NA NA 0.001 NA NA NA 0.001 0.001 NA 0.001 NA 51NANANANANA NA NANA NA NA NANANA NANA NA 52NANANANANA NA 0.03NA NA NA NANANA NANA NA 53NANANANANA NA NANA NA NA NANANA 1NA NA 54NANANANANA NA NANA NA 1.2 NANANA NANA NA Na NH4 NO3-NO2 Pb pH Sb SC Se SO4 Sr TDS Tl V Zn 49 NA NA NA NA NA NA NA NA NA NA 15 NA NA NA 50 NA NA NA NA NA NA NA NA NA NA NA NA NA NA 51 NA NA NA NA NA NA NA NA NA NA NA NA NA NA 52 NA NA NA NA NA NA NA NA NA NA NA NA NA NA 53 NA NA NA NA NA NA NA NA NA NA NA NA NA NA 54 NA NA NA NA NA NA NA NA NA NA NA NA NA NA Hope this helps, Rui Barradas Em 05-07-2012 19:28, Rich Shepard escreveu: On Thu, 5 Jul 2012, Rich Shepard wrote: I wonder if the issue is with the logical columns; I've not used them before, only dates, factors, and numerals. I'll try without the logicals and see if there's a difference. Nope. Same problem as before. Someone, please provide a process I can apply to figure out the error in the data frame. Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reshape2 errors on data frame
On Thu, 5 Jul 2012, R. Michael Weylandt michael.weyla...@gmail.com wrote: What do you mean it won't cooperate? Error message? Michael, I get the command echoed rather than results. Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reshape2 errors on data frame
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Rich Shepard Sent: Thursday, July 05, 2012 11:29 AM To: r-help@r-project.org Subject: Re: [R] reshape2 errors on data frame On Thu, 5 Jul 2012, Rich Shepard wrote: I wonder if the issue is with the logical columns; I've not used them before, only dates, factors, and numerals. I'll try without the logicals and see if there's a difference. Nope. Same problem as before. Someone, please provide a process I can apply to figure out the error in the data frame. Rich Rich, I apologize that I did not read your original post carefully enough. I missed that your problem was with the dcast() function. I am not very familiar with the reshape package, but maybe I can give you some ideas of how to proceed. When I ran your melt syntax there was no error message, however there was a messages saying Using site, sampdate, param as id variables. That is, not all of the ID variables specified were used. So the chem.melt data frame did not have all the ID variables in it. Therefore, when I tried to run the dcast() syntax that you gave, it could not find some of variables specified. In your case, the error message stated that it could not find sampdate. If you run str(chem.melt) do you see the variable sampdate? I am not sure how you are wanting to re-shape your data, but when you specify a variable in dcast(), it needs to be in the dataframe you are working with. Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Silencing the output of install.packages()
Is there a way to suppress the output of 'install.packages()'? I have seen that the 'download.file' function has a 'quiet' option but I do not know how to use it. I do not see any good reason to allow that. A user should see if software is being installed. Hi Uwe, I have a proposed use-case. We run a series of unit tests in R, and want to plug them into various continuous build/test frameworks. Part of our system involves being able to install packages, so we need to actually run a unit test that uses `install.packages`. For better or worse, many of these depend on parsing the output of a test run; it's a lot of silly work to make them parse the additional `install.packages` output (as opposed to the much simpler `[\w]+: \.+` output from something like testthat in the case of success). Two questions: * Is there another workaround? I'm planning on some ugly hackery on our side to deal with this, but would love something cleaner. * If I were willing to write the patch, would turning these statements into something we could suppress be acceptable for R? Thanks! -cc -- View this message in context: http://r.789695.n4.nabble.com/Silencing-the-output-of-install-packages-tp4631947p4635525.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Confused about multiple imputation with rms or Hmisc packages
Hello, I'm working on a Cox Proportional Hazards model for a cancer data set that has missing values for the categorical variable Grade in less than 10% of the observations. I'm not a statistician, but based on my readings of Frank Harrell's book it seems to be a candidate for using multiple imputation technique(s). I understand the concepts behind imputation, but using the functions in rms and Hmisc is confounding me. For instance, whether to use transcan or aregImpute. Here is a sample of my data: https://dl.dropbox.com/u/1852742/sample.csv Drawing from Chapter 8 of Harrell's book, this is what I've been toying with: #recurfree_survival_fromsx is survival time, rf_obs_sx codes for events as a binary variable. #The CPH model I would like to fit, using Ograde_dx as the variable for overall grade at #diagnosis, ord_nodes as an ordinal variable for the # lymph nodes involved. obj=with(mydata, Surv(recurfree_survival_fromsx,rf_obs_sx)) mod=cph(obj~ord_nodes+Ograde_dx+ERorPR+HER2_Sum,data=mydata,x=T,y=T) #Impute missing data mydata.transcan=transcan(~Ograde_dx+tumorsize+ord_nodes+simp_stage_path+afam+ Menopause+Age,imputed=T,n.impute=10) summary(mydata.transcan) The issues I have are: a) In your opinion(s), should I even be imputing this data? Is it appropriate here? b) Even after reading the help pages and Harrell's book, I'm not sure I used the correct imputation method, and whether I should be using transcan or aregImpute. c) In the output of summary(transcan), is R-squared the best value to describe how reliably the function could predict Ograde_dx? What is an acceptable level? d) Do I use the function fit.mult.impute to fit my final cph model? I appreciate your help with this as it is a somewhat confusing topic. I hope I gave you all the information you need to answer my questions. Sincerely, Jahan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Conditional Logistic regression with random effects / 2 random effects logit models
Dear All: I am stuck on this problem and I could use some help from folks out there. I want to build a conditional logistic model with random effects or logistic regression with 2 random effects. I have data on mothers and their kids. I want to condition 1 mom predictor(0/1) and also for sibling clustering (multiple kids per mom). I would appreciate any info on a package to use, and a help on writing the codes. Variables : MomID as random 1 , momcondpredictor(0/1) as random 2 , mommainpredictor + covariate kidoutcome(0/1) regressed mompredictor1 + covariates + the 2 random effects. Many thanks for your help. YA -- View this message in context: http://r.789695.n4.nabble.com/Conditional-Logistic-regression-with-random-effects-2-random-effects-logit-models-tp4635528.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Histogram
I have a column of 1000 datapoints from the normal distribution with mean 2 and variance 4. How can I get a histogram of these observations with 20 bins with each bin having 50 observations? -- Thanks, Jim. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Exclude missing values on only 1 variable
HI, Try this: set.seed(1) dat1-data.frame(var1=c(rep(NA,3),1:3),var2=c(1:4,NA,5),var3=c(1:5,NA),var4=rnorm(6,15),var5=runif(6,0.2),var6=rep(NA,6)) dat1[rowSums(is.na(dat1[,c(var1,var2,var3)]))==0,] var1 var2 var3 var4 var5 var6 4 1 4 4 16.59528 0.5981594 NA A.K. - Original Message - From: Eiko Fried tor...@gmail.com To: r-help@r-project.org Cc: Sent: Thursday, July 5, 2012 1:25 PM Subject: [R] Exclude missing values on only 1 variable Hello, I have many hundred variables in my longitudinal dataset and lots of missings. In order to plot data I need to remove missings. If I do data - na.omit(data) that will reduce my dataset to 2% of its original size ;) So I only need to listwise delete missings on 3 variables (the ones I am plotting). data$variable1 -na.omit(data$variable1) does not work. Thank you [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hi,Try this: __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fw: Exclude missing values on only 1 variable
HI, Try this: set.seed(1) dat1-data.frame(var1=c(rep(NA,3),1:3),var2=c(1:4,NA,5),var3=c(1:5,NA),var4=rnorm(6,15),var5=runif(6,0.2),var6=rep(NA,6)) dat1[rowSums(is.na(dat1[,c(var1,var2,var3)]))==0,] var1 var2 var3 var4 var5 var6 4 1 4 4 16.59528 0.5981594 NA A.K. - Original Message - From: Eiko Fried tor...@gmail.com To: r-help@r-project.org Cc: Sent: Thursday, July 5, 2012 1:25 PM Subject: [R] Exclude missing values on only 1 variable Hello, I have many hundred variables in my longitudinal dataset and lots of missings. In order to plot data I need to remove missings. If I do data - na.omit(data) that will reduce my dataset to 2% of its original size ;) So I only need to listwise delete missings on 3 variables (the ones I am plotting). data$variable1 -na.omit(data$variable1) does not work. Thank you [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hi,Try this: __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] GEE with Inverse Probability Weights
Thanks -- extremely helpful. But what is the mechanism by which this analysis corrects for the fact that my subjects are clustered (twins)? -- View this message in context: http://r.789695.n4.nabble.com/GEE-with-Inverse-Probability-Weights-tp4633172p4635533.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] colored nodes in dendrogram
Dear list, is there a way how to add information to internal nodes (branching points) in dendrogram created via plot.agnes function (package cluster)? I wish to place colored circles on the nodes, but I don't know how to proceed... I'll be grateful for any suggestion Ondřej -- Ondřej Mikula Institute of Animal Physiology and Genetics Academy of Sciences of the Czech Republic Veveri 97, 60200 Brno, Czech Republic Institute of Vertebrate Biology Academy of Sciences of the Czech Republic Studenec 122, 67502 Konesin, Czech Republic __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Histogram
Hi Jim, You can't specify both number of bins and bin size. You can specify breaks: either the number of bins or the location of breakpoints. A histogram with 20 bins of 50 observations each must by definition come from a uniform distribution. What are you trying to accomplish? Sarah On Thu, Jul 5, 2012 at 3:29 PM, Jim Silverton jim.silver...@gmail.com wrote: I have a column of 1000 datapoints from the normal distribution with mean 2 and variance 4. How can I get a histogram of these observations with 20 bins with each bin having 50 observations? -- Thanks, Jim. -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reshape2 errors on data frame
Hello, Inline Em 05-07-2012 20:21, Nordlund, Dan (DSHS/RDA) escreveu: -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Rich Shepard Sent: Thursday, July 05, 2012 11:29 AM To: r-help@r-project.org Subject: Re: [R] reshape2 errors on data frame On Thu, 5 Jul 2012, Rich Shepard wrote: I wonder if the issue is with the logical columns; I've not used them before, only dates, factors, and numerals. I'll try without the logicals and see if there's a difference. Nope. Same problem as before. Someone, please provide a process I can apply to figure out the error in the data frame. Rich Rich, I apologize that I did not read your original post carefully enough. I missed that your problem was with the dcast() function. I am not very familiar with the reshape package, but maybe I can give you some ideas of how to proceed. When I ran your melt syntax there was no error message, however there was a messages saying Using site, sampdate, param as id variables. That is, not all of the ID variables specified were used. So the chem.melt data frame did not have all the ID variables in it. Therefore, when I tried to run the dcast() syntax that you gave, it could not find some of variables specified. In your case, the error message stated that it could not find sampdate. If you run I should have said that in my earlier post. It's implicit in the misspeling of the argument 'id.vars', an argument of melt() not of dcast(). Therefore, melt() issues a warning, the warning in the OP. I'll reproduce it here: chem.melt - melt(waterchem, + idvars = c('site', 'sampdate', 'preeq0', 'param', 'ceneq1', 'floor', 'ceiling')) Using site, preeq0, param, ceneq1 as id variables Hope this helps, Rui Barradas str(chem.melt) do you see the variable sampdate? I am not sure how you are wanting to re-shape your data, but when you specify a variable in dcast(), it needs to be in the dataframe you are working with. Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Exclude missing values on only 1 variable
On Jul 5, 2012, at 1:34 PM, David Winsemius wrote: On Jul 5, 2012, at 1:25 PM, Eiko Fried wrote: Hello, I have many hundred variables in my longitudinal dataset and lots of missings. In order to plot data I need to remove missings. If I do data - na.omit(data) that will reduce my dataset to 2% of its original size ;) So I only need to listwise delete missings on 3 variables (the ones I am plotting). data$variable1 -na.omit(data$variable1) ?complete.cases # returns a logical vector data[ complete.cases( data[ , c(var1, var2, var3]) , ] data[ complete.cases( data[ , c(var1, var2, var3) ] ) , ] Er, need to add a right - paren -here ---^ -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Histogram
Hello, Try the following. x - rnorm(1000, mean=2, sd=2) p - seq(0, 1, by=1/20) quant - quantile(x, probs=p) hist(x, breaks=quant) The method is absolutely general, that's why I've separated the several steps, to make it clear. Hope this helps, Rui Barradas Em 05-07-2012 20:29, Jim Silverton escreveu: I have a column of 1000 datapoints from the normal distribution with mean 2 and variance 4. How can I get a histogram of these observations with 20 bins with each bin having 50 observations? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] GEE with Inverse Probability Weights
Hi Frank, It clusters by twin, that is why in Dr. Lumley's example, the id was twin pair, not individual, and the SE is adjusted accordingly. Cheers, Josh On Thu, Jul 5, 2012 at 12:10 PM, RFrank spark...@gmail.com wrote: Thanks -- extremely helpful. But what is the mechanism by which this analysis corrects for the fact that my subjects are clustered (twins)? -- View this message in context: http://r.789695.n4.nabble.com/GEE-with-Inverse-Probability-Weights-tp4633172p4635533.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Histogram
There's no reason you can't do that with normally-distributed data, though I'm not sure why you'd want to. My point was rather that you can't specify the bin width and size both. If you let the bin size vary, this will work: set.seed(1234) mydata - rnorm(1000, mean = 2, sd = 4) mydata.hist - hist(mydata, breaks=quantile(mydata, probs=seq(0, 1, length.out = length(mydata)/50 + 1))) mydata.hist$counts Sarah On Thu, Jul 5, 2012 at 3:37 PM, Jim Silverton jim.silver...@gmail.com wrote: Thanks Sarah!! Ok so if I have say x = runif(1000,0,1) say instead if the normal and I want a histogram with bins that have an equal number of observations. For example if I want each bin to have 50 observations, how do I do this? On Thu, Jul 5, 2012 at 3:34 PM, Sarah Goslee sarah.gos...@gmail.com wrote: Hi Jim, You can't specify both number of bins and bin size. You can specify breaks: either the number of bins or the location of breakpoints. A histogram with 20 bins of 50 observations each must by definition come from a uniform distribution. What are you trying to accomplish? Sarah On Thu, Jul 5, 2012 at 3:29 PM, Jim Silverton jim.silver...@gmail.com wrote: I have a column of 1000 datapoints from the normal distribution with mean 2 and variance 4. How can I get a histogram of these observations with 20 bins with each bin having 50 observations? -- Thanks, Jim. -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Histogram
On 05/07/2012 3:34 PM, Sarah Goslee wrote: Hi Jim, You can't specify both number of bins and bin size. You can specify breaks: either the number of bins or the location of breakpoints. A histogram with 20 bins of 50 observations each must by definition come from a uniform distribution. Only if the bins are equal in size. You could set the bins to quantiles, for the following ugly plot: hist(x, breaks=quantile(x, seq(0, 1, by=0.05))) Duncan Murdoch What are you trying to accomplish? Sarah On Thu, Jul 5, 2012 at 3:29 PM, Jim Silverton jim.silver...@gmail.com wrote: I have a column of 1000 datapoints from the normal distribution with mean 2 and variance 4. How can I get a histogram of these observations with 20 bins with each bin having 50 observations? -- Thanks, Jim. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Histogram
Hello, With the confusion between bin size and width the OP started, I'll repost my answer with a final line. Sorry for the repetition. h - hist(x, breaks=quantile(x, probs=seq(0, 1, by=1/20))) h$counts [1] 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 Hope this helps, Rui Barradas Em 05-07-2012 20:47, Sarah Goslee escreveu: There's no reason you can't do that with normally-distributed data, though I'm not sure why you'd want to. My point was rather that you can't specify the bin width and size both. If you let the bin size vary, this will work: set.seed(1234) mydata - rnorm(1000, mean = 2, sd = 4) mydata.hist - hist(mydata, breaks=quantile(mydata, probs=seq(0, 1, length.out = length(mydata)/50 + 1))) mydata.hist$counts Sarah On Thu, Jul 5, 2012 at 3:37 PM, Jim Silverton jim.silver...@gmail.com wrote: Thanks Sarah!! Ok so if I have say x = runif(1000,0,1) say instead if the normal and I want a histogram with bins that have an equal number of observations. For example if I want each bin to have 50 observations, how do I do this? On Thu, Jul 5, 2012 at 3:34 PM, Sarah Goslee sarah.gos...@gmail.com wrote: Hi Jim, You can't specify both number of bins and bin size. You can specify breaks: either the number of bins or the location of breakpoints. A histogram with 20 bins of 50 observations each must by definition come from a uniform distribution. What are you trying to accomplish? Sarah On Thu, Jul 5, 2012 at 3:29 PM, Jim Silverton jim.silver...@gmail.com wrote: I have a column of 1000 datapoints from the normal distribution with mean 2 and variance 4. How can I get a histogram of these observations with 20 bins with each bin having 50 observations? -- Thanks, Jim. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to do non-parametric Anova for comparing two lm models?
I am trying to compare two lm models, one including a categorical variable and the other excluding the variable... The residuals of both lm fits are highly non-Gaussian. Could you please give me some pointers? Thank you! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NADA Data Frame Format: Wide or Long?
I haven't used NADA functions in quite a while, but from what I recall, you will likely be using the narrow format, and sub-setting as needed for the different analytes. As Jean suggested, the examples in the help pages for the NADA function(s) of interest should make it clear. This example follows exactly the example in ?cenros. with( subset(yourdataframe, param=='Ag'), cenros(quant,ceneq1) ) This should do a simple censored summary statistica calculation for silver (assuming quant contains your reporting level for censored results, which appears to be the case). I'd also suggest you try to load your data so that site and param are not factors, though this could depend on your ultimate analysis. -Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 On 7/3/12 9:57 AM, Rich Shepard rshep...@appl-ecosys.com wrote: I have water chemistry data with censored values (i.e., those less than reporting levels) in a data frame with a narrow (i.e., database table) format. The structure is: $ site: Factor w/ 64 levels D-1,D-2,D-3,..: 1 1 1 1 1 1 1 1 ... $ sampdate: Date, format: 2007-12-12 2007-12-12 ... $ preeq0 : logi TRUE TRUE TRUE TRUE TRUE TRUE ... $ param : Factor w/ 37 levels Ag,Al,Alk_tot,..: 1 2 8 17 3 4 9 ... $ quant : num 0.005 0.106 1 231 231 0.011 0.001 0.002 0.001 100 ... $ ceneq1 : logi TRUE FALSE TRUE FALSE FALSE FALSE ... $ floor : num 0 0.106 0 231 231 0.011 0 0 0 100 ... $ ceiling : num 0.005 0.106 1 231 231 0.011 0.001 0.002 0.001 100 ... The logical 'preeq0' separates sampdate into two groups; 'ceneq1' indicates censored/uncensored values; 'floor' and 'ceiling' are the minima and maxima for censored values. The NADA package methods will be used, but I have not found information on whether this format or the wide (i.e., spreadsheet) format should be used. The NADA.pdf document doesn't tell me; at least, I haven't found the answer there. I can apply reshape2 to melt and re-cast the data in wide format if that's what is appropriate. Please provide a pointer to documents I can read for an answer to this and related questions. Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reshape2 errors on data frame [RESOLVED]
On Thu, 5 Jul 2012, Rui Barradas wrote: Just to give it a try, you've written 'idvars' when it's 'id.vars'. Rui, Oops! That was the problem. No matter how often I looked at the syntax I kept missing that. Hope this helps, Most definitely! Thanks for spotting it, Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Histogram
Which gives Jim two ways to arrive at exactly the same result, just different means of specifying the probs for quantile(). Sarah On Thu, Jul 5, 2012 at 4:01 PM, Rui Barradas ruipbarra...@sapo.pt wrote: Hello, With the confusion between bin size and width the OP started, I'll repost my answer with a final line. Sorry for the repetition. h - hist(x, breaks=quantile(x, probs=seq(0, 1, by=1/20))) h$counts [1] 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 Hope this helps, Rui Barradas Em 05-07-2012 20:47, Sarah Goslee escreveu: There's no reason you can't do that with normally-distributed data, though I'm not sure why you'd want to. My point was rather that you can't specify the bin width and size both. If you let the bin size vary, this will work: set.seed(1234) mydata - rnorm(1000, mean = 2, sd = 4) mydata.hist - hist(mydata, breaks=quantile(mydata, probs=seq(0, 1, length.out = length(mydata)/50 + 1))) mydata.hist$counts Sarah On Thu, Jul 5, 2012 at 3:37 PM, Jim Silverton jim.silver...@gmail.com wrote: Thanks Sarah!! Ok so if I have say x = runif(1000,0,1) say instead if the normal and I want a histogram with bins that have an equal number of observations. For example if I want each bin to have 50 observations, how do I do this? On Thu, Jul 5, 2012 at 3:34 PM, Sarah Goslee sarah.gos...@gmail.com wrote: Hi Jim, You can't specify both number of bins and bin size. You can specify breaks: either the number of bins or the location of breakpoints. A histogram with 20 bins of 50 observations each must by definition come from a uniform distribution. What are you trying to accomplish? Sarah On Thu, Jul 5, 2012 at 3:29 PM, Jim Silverton jim.silver...@gmail.com wrote: I have a column of 1000 datapoints from the normal distribution with mean 2 and variance 4. How can I get a histogram of these observations with 20 bins with each bin having 50 observations? -- Thanks, Jim. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] vector entry in matix
To second Bert Gunter: you may get better answers if you give us a complete description. On Thu, Jul 5, 2012 at 12:37 PM, Thomas C. thomas.csa...@gmail.com wrote: i have number of triangles which i'd like to store in a list, matrix,...etc. i thought it could look sth. like that: trianglenode1node2node3 . 1 c(x1,y1) c(x2,y3) c(x3,y3) 1 c(x4,y4) c(x5,y5) c(x6,y6) You already indicate that that's not the whole truth, but if it was you could consider one of two approaches: * Rather than a 2D array (also called matrix) use a 3D array, with dimensions n.triangle x 3 x 2. * You could use a n.triangle x 3 matrix of complex numbers. That may look like a sneaky trick to put a vector into a matrix element, but some operations of planar geometry can written elegantly. E.g., distance between a and b is abs(a-b). It all depends on what you want to do. BTW, what do you consider a neighbor of a triangle? * * If you are working with a triangle mesh, it might be better to hold the vertices in a double matrix (n.vertex x 2) and represent the triangles as indices into it, i.e., as an integer matrix (n.triangle x 3). You may use more columns in either matrix to represent more attributes. If you find out you need columns of different types, consider a data frame. /Christian [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NADA Data Frame Format: Wide or Long?
On Thu, 5 Jul 2012, MacQueen, Don wrote: This example follows exactly the example in ?cenros. with( subset(yourdataframe, param=='Ag'), cenros(quant,ceneq1) ) This should do a simple censored summary statistica calculation for silver (assuming quant contains your reporting level for censored results, which appears to be the case). Don, That makes sense to me. I was hoping to avoid subsetting the data frame for each of the 37 chemical parameters, but ... I will review the use of with(). I'd also suggest you try to load your data so that site and param are not factors, though this could depend on your ultimate analysis. I do need to differentiate results by site and chemical paramater. Many thanks, Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] vector entry in matix
Hi Thomas, This is non trivial to do, but if you will be working with this sort of data and are inclined to do some programming, you might consider creating a new class. S4 classes and methods are quite flexible, and you can allow them to lean on or inherit from existing classes such as matrices. For example, your class might consist of three lists and a data frame. The length of each list would be forced to be equal to the number of rows in the data frame. Each element of each list would support a length 2 numeric vector (or integer vector or whatever it is you need). THe lists then would hold nodes 1 - 3, and the data frame would hold other information. You can similarly write methods so that for your special class, d[1, 1] would return c(x1, y1), essentially you create a mix of lists and a matrix with all the necessary constraints, and you develop methods that allow you to operate on this in a way that is sensible for your data. This is stab in the dark, but if you happen to be dealing with social network style data, (triangles and neighbors make me think of that), you should investigate existing packages for that as they may already have their own classes/have a way they expect data to be structured. Off the top of my head, the package sna comes to mind as a starting place to look (though I am not personally very familiar with it). Cheers, Josh On Thu, Jul 5, 2012 at 9:37 AM, Thomas C. thomas.csa...@gmail.com wrote: well thank you, i think we get closer to my problem. but i meant it in a different way. maybe i describe what i intended to do: i have number of triangles which i'd like to store in a list, matrix,...etc. i thought it could look sth. like that: trianglenode1node2node3 . 1 c(x1,y1) c(x2,y3) c(x3,y3) 1 c(x4,y4) c(x5,y5) c(x6,y6) . . . i know that i could just make more rows for the x/y values but the problem is, that i need to store more than these properties (say the neighbours of these nodes and their coordinates) and i thought it would be more convenient to save these as vectors. it was just a thought... is this possible? -- View this message in context: http://r.789695.n4.nabble.com/vector-entry-in-matix-tp4635478p4635509.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] trouble installing Rmpi on a debian machine
Dear R People: I'm having trouble installing Rmpi on a debian machine. Here is my output: bccd@node000:~$ /bccd/home/bccd bccd@node000:~$ bccd@node000:~$ export RMPI_TYPE=OPENMPI bccd@node000:~$ R CMD INSTALL Rmpi_0.5-9.tar.gz * installing to library '/bccd/home/bccd/R/x86_64-pc-linux-gnu-library/2.15' * installing *source* package 'Rmpi' ... checking for gcc... gcc -std=gnu99 checking for C compiler default output file name... a.out checking whether the C compiler works... yes checking whether we are cross compiling... no checking for suffix of executables... checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether gcc -std=gnu99 accepts -g... yes checking for gcc -std=gnu99 option to accept ISO C89... none needed checking how to run the C preprocessor... gcc -std=gnu99 -E checking for grep that handles long lines and -e... /bin/grep checking for egrep... /bin/grep -E checking for ANSI C header files... yes checking for sys/types.h... yes checking for sys/stat.h... yes checking for stdlib.h... yes checking for string.h... yes checking for memory.h... yes checking for strings.h... yes checking for inttypes.h... yes checking for stdint.h... yes checking for unistd.h... yes checking mpi.h usability... yes checking mpi.h presence... yes checking for mpi.h... yes Trying to find libmpi.so or libmpich.a ... checking for main in -lmpi... yes checking for openpty in -lutil... yes checking for main in -lpthread... yes configure: creating ./config.status config.status: creating src/Makevars ** Creating default NAMESPACE file ** libs gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -DPACKAGE_NAME=\\ -DPACKAGE_TARNAME=\\ -DPACKAGE_VERSION=\\ -DPACKAGE_STRING=\\ -DPACKAGE_BUGREPORT=\\ -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -I -DMPI2 -DOPENMPI -fpic -O3 -pipe -g -c RegQuery.c -o RegQuery.o gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -DPACKAGE_NAME=\\ -DPACKAGE_TARNAME=\\ -DPACKAGE_VERSION=\\ -DPACKAGE_STRING=\\ -DPACKAGE_BUGREPORT=\\ -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -I -DMPI2 -DOPENMPI -fpic -O3 -pipe -g -c Rmpi.c -o Rmpi.o In file included from Rmpi.c:18: Rmpi.h:2:17: error: mpi.h: No such file or directory In file included from Rmpi.c:18: Rmpi.h:15: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'mpitype' Rmpi.c:24: error: expected '=', ',', ';', 'asm' or '__attribute__' before '*' token Rmpi.c:25: error: expected '=', ',', ';', 'asm' or '__attribute__' before '*' token Rmpi.c:26: error: expected '=', ',', ';', 'asm' or '__attribute__' before '*' token Rmpi.c:27: error: expected '=', ',', ';', 'asm' or '__attribute__' before '*' token Rmpi.c:28: error: expected '=', ',', ';', 'asm' or '__attribute__' before '*' token Rmpi.c: In function 'mpi_initialize': Rmpi.c:57: warning: implicit declaration of function 'MPI_Initialized' Rmpi.c:76: warning: implicit declaration of function 'MPI_Init' Rmpi.c:81: warning: implicit declaration of function 'MPI_Errhandler_set' Rmpi.c:81: error: 'MPI_COMM_WORLD' undeclared (first use in this function) Rmpi.c:81: error: (Each undeclared identifier is reported only once Rmpi.c:81: error: for each function it appears in.) Rmpi.c:81: error: 'MPI_ERRORS_RETURN' undeclared (first use in this function) Rmpi.c:82: error: 'MPI_COMM_SELF' undeclared (first use in this function) Rmpi.c:83: error: 'comm' undeclared (first use in this function) Rmpi.c:83: error: 'MPI_Comm' undeclared (first use in this function) Rmpi.c:83: error: expected expression before ')' token Rmpi.c:83: error: expected ';' before 'R_chk_calloc' Rmpi.c:84: error: 'status' undeclared (first use in this function) Rmpi.c:84: error: 'MPI_Status' undeclared (first use in this function) Rmpi.c:84: error: expected expression before ')' token Rmpi.c:84: error: expected ';' before 'R_chk_calloc' Rmpi.c:85: error: 'datatype' undeclared (first use in this function) Rmpi.c:85: error: 'MPI_Datatype' undeclared (first use in this function) Rmpi.c:85: error: expected expression before ')' token Rmpi.c:85: error: expected ';' before 'R_chk_calloc' Rmpi.c:86: error: 'info' undeclared (first use in this function) Rmpi.c:86: error: 'MPI_Info' undeclared (first use in this function) Rmpi.c:86: error: expected expression before ')' token Rmpi.c:86: error: expected ';' before 'R_chk_calloc' Rmpi.c:87: error: 'MPI_INFO_NULL' undeclared (first use in this function) Rmpi.c:88: error: 'request' undeclared (first use in this function) Rmpi.c:88: error: 'MPI_Request' undeclared (first use in this function) Rmpi.c:88: error: expected expression before ')' token Rmpi.c:88: error: expected ';' before 'R_chk_calloc' Rmpi.c:89: error: 'MPI_REQUEST_NULL' undeclared (first
[R] trouble installing Rmpi on a debian machine: please ignore
It was looking for the mpi.h file. Sorry for the trouble. -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] EM algorithm to find MLE of coeff in mixed effects model
Seems that I found the problem. The degree of freedom was missing as a denominator... Thank you for your help. Best wishes, Jie On Thu, Jul 5, 2012 at 9:28 AM, John Kane jrkrid...@inbox.com wrote: I don't believe that R-help permits pdf files. A useful workaround is to post it to a file hosting site like MediaFire and post the link here. John Kane Kingston ON Canada -Original Message- From: jimmycl...@gmail.com Sent: Wed, 4 Jul 2012 22:56:02 -0400 To: pauljoh...@gmail.com Subject: Re: [R] EM algorithm to find MLE of coeff in mixed effects model Dear Paul, Thank you for your suggestion. I was moved by the fact that people are so nice to help learners and ask for nothing. With your help, I made some changes to my code and add more comments, try to make things more clear. If this R email list allow me to upload a pdf file to illustrate the formula, it will be great. The reason I do not use lme4 is that later I plan to use this for a more complicated model (Y,T), Y is the same response of mixed model and T is a drop out process. (Frankly I did it for the more complicated model first and got NaN hessian after one iteration, so I turned to the simple model.) In the code, the m loop is the EM iterations. About efficiency, thank you again for pointing it out. I compared sapply and for loop, they are tie and for loop is even better sometimes. In a paper by Bates, he mentioned that the asymptotic properties of beta is kind of independent of the other two parameters. But it seems that paper was submitted but I did not find it on google scholar. Not sure if it is the reason for my results. That is why I hope some expert who have done some simulation similar may be willing to give some suggestion. I may turn to other methods to calculate the conditional expection of the latent variable as the same time. Modified code is as below: # library(PerformanceAnalytics) # install.packages(gregmisc) # install.packages(statmod) library(gregmisc) library(MASS) library(statmod) ## function to calculate loglikelihood loglike - function(datai = data.list[[i]], vvar = var.old, beta = beta.old, psi = psi.old) { temp1 - -0.5 * log(det(vvar * diag(nrow(datai$Zi)) + datai$Zi %*% psi %*% t(datai$Zi))) temp2 - -0.5 * t(datai$yi - datai$Xi %*% beta) %*% solve(vvar * diag(nrow(datai$Zi)) + datai$Zi %*% psi %*% t(datai$Zi)) %*% (datai$yi - datai$Xi %*% beta) temp1 + temp2 } ## functions to evaluate the conditional expection, saved as Efun v2.R ## Eh1new=E(bibi') ## Eh2new=E(X'(y-Zbi)) ## Eh3new=E(bi'Z'Zbi) ## Eh4new=E(Y-Xibeta-Zibi)'(Y-Xibeta-Zibi) ## Eh5new=E(Xibeta-yi)'Zibi ## Eh6new=E(bi) Eh1new - function(datai = data.list[[i]], weights.m = weights.mat) { bi - datai$b tempb - bi * rep(sqrt(weights.m[, 1] * weights.m[, 2]), 2) #quadratic b, so needsqrt t(tempb) %*% tempb/pi } # end of Eh1 # Eh2new=E(X'(y-Zbi)) Eh2new - function(datai = data.list[[i]], weights.m = weights.mat) { bi - datai$b tempb - bi * rep(weights.m[, 1] * weights.m[, 2], 2) tt - t(datai$Xi) %*% datai$yi - t(datai$Xi) %*% datai$Zi %*% colSums(tempb)/pi c(tt) } # end of Eh2 # Eh3new=E(b'Z'Zbi) Eh3new - function(datai = data.list[[i]], weights.m = weights.mat) { bi - datai$b tempb - bi * rep(sqrt(weights.m[, 1] * weights.m[, 2]), 2) #quadratic b, so need sqrt sum(sapply(as.list(data.frame(datai$Zi %*% t(tempb))), function(s) { sum(s^2) }))/pi } # end of Eh3 # Eh4new=E(Y-Xibeta-Zibi)'(Y-Xibeta-Zibi) Eh4new - function(datai = data.list[[i]], weights.m = weights.mat, beta = beta.old) { bi - datai$b tt - sapply(as.list(ata.frame(c(datai$yi - datai$Xi %*% beta) - datai$Zi %*% t(bi))), function(s) { sum(s^2) }) * (weights.m[, 1] * weights.m[, 2]) sum(tt)/pi } # end of Eh4 Eh4newv2 - function(datai = data.list[[i]], weights.m = weights.mat, beta = beta.old) { bi - datai$b v - weights.m[, 1] * weights.m[, 2] temp - c() for (i in 1:length(v)) { temp[i] - sum(((datai$yi - datai$Xi %*% beta - datai$Zi %*% bi[i, ]) * sqrt(v[i]))^2) } sum(temp)/pi } # end of Eh4 # Eh5new=E(Xibeta-yi)'Zib Eh5new - function(datai = data.list[[i]], weights.m = weights.mat, beta = beta.old) { bi - datai$b tempb - bi * rep(weights.m[, 1] * weights.m[, 2], 2) t(datai$Xi %*% beta - datai$yi) %*% (datai$Zi %*% t(tempb)) sum(t(datai$Xi %*% beta - datai$yi) %*% (datai$Zi %*% t(tempb)))/pi } Eh6new - function(datai = data.list[[i]], weights.m = weights.mat) { bi - datai$b tempw - weights.m[, 1] * weights.m[, 2] for (i in 1:nrow(bi)) { bi[i, ] - bi[i, ] * tempw[i] }
Re: [R] GEE with Inverse Probability Weights
If you're going to reply to something from two weeks ago, it's helpful to include more of the conversation. However, the mechanism is straightforward. The standard error estimator assumes only that observations in different clusters are independent: it approximates the variance of the estimating functions by the empirical variance of the cluster totals of the estimating functions, and uses the delta method to convert this to a variance for the coefficients. It's the same as GEE. In this simple setting it's the same as the GEE variance estimator. - thomas On Fri, Jul 6, 2012 at 7:40 AM, Joshua Wiley jwiley.ps...@gmail.com wrote: Hi Frank, It clusters by twin, that is why in Dr. Lumley's example, the id was twin pair, not individual, and the SE is adjusted accordingly. Cheers, Josh On Thu, Jul 5, 2012 at 12:10 PM, RFrank spark...@gmail.com wrote: Thanks -- extremely helpful. But what is the mechanism by which this analysis corrects for the fact that my subjects are clustered (twins)? -- View this message in context: http://r.789695.n4.nabble.com/GEE-with-Inverse-Probability-Weights-tp4633172p4635533.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Thomas Lumley Professor of Biostatistics University of Auckland __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fwd: An application for a master's student in statistics
Profesor Simon N. Wood As a student from the Masters on statistics at Nacional University I am writing to you to ask for some information due to knowledge on the field I am working on I wonder if you are so kind as to help out with the mathematical and stadistical details as well as information about the funtion predict.gam {mgcv} which is part of the packet mgcv from the software R. I really appreciate both the interest and time you can give to my request. Your faithfully Arlington Zahìr Llerena Bogota, Colombia -- * * * El alma se nutre a través del silencio, el estudio, el consumo justo, el contacto con la naturaleza y el conocimiento de uno mismo Alberto D. Fraila Oliver Arlington Zahìr Llerena...!!!* -- * * * El alma se nutre a través del silencio, el estudio, el consumo justo, el contacto con la naturaleza y el conocimiento de uno mismo Alberto D. Fraila Oliver Arlington Zahìr Llerena...!!!* [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Plotting the probability curve from a logit model with 10 predictors
I have a logit model with about 10 predictors and I am trying to plot the probability curve for the model. Y=1 = 1 / 1+e^-z where z=B0 + B1X1 + ... + BnXi If the model had only one predictor, I know to do something like below. mod1 = glm(factor(won) ~ as.numeric(bid), data=mydat, family=binomial(link=logit)) all.x - expand.grid(won=unique(won), bid=unique(bid)) y.hat.new - predict(mod1, newdata=all.x, type=response) plot(bid-000:250,predict(mod1,newdata=data.frame(bid-c(000:250)),type=response), lwd=5, col=blue, type=l) I'm not sure how to proceed when I have 10 or so predictors in the logit model. Do I simply expand the expand.grid() function to include all the variables? So my question is how do I form a plot of a logit probability curve when I have 10 predictors? would be nice to do this in ggplot2. Thanks! -- *Abraham Mathew Statistical Analyst www.amathew.com 720-648-0108 @abmathewks* [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Silencing the output of install.packages()
On Thu, Jul 5, 2012 at 1:03 PM, craigcitro craigci...@gmail.com wrote: Is there a way to suppress the output of 'install.packages()'? I have seen that the 'download.file' function has a 'quiet' option but I do not know how to use it. I do not see any good reason to allow that. A user should see if software is being installed. Hi Uwe, I have a proposed use-case. We run a series of unit tests in R, and want to plug them into various continuous build/test frameworks. Part of our system involves being able to install packages, so we need to actually run a unit test that uses `install.packages`. For better or worse, many of these depend on parsing the output of a test run; it's a lot of silly work to make them parse the additional `install.packages` output (as opposed to the much simpler `[\w]+: \.+` output from something like testthat in the case of success). Two questions: * Is there another workaround? I'm planning on some ugly hackery on our side to deal with this, but would love something cleaner. * If I were willing to write the patch, would turning these statements into something we could suppress be acceptable for R? I think this has already been done in r59493ff of the svn repos by the indefatigable Prof. Ripley Best, Michael Thanks! -cc -- View this message in context: http://r.789695.n4.nabble.com/Silencing-the-output-of-install-packages-tp4631947p4635525.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plotting the probability curve from a logit model with 10 predictors
You have an about 11-D response surface, not a curve! -- Bert On Thu, Jul 5, 2012 at 2:39 PM, Abraham Mathew abmathe...@gmail.com wrote: I have a logit model with about 10 predictors and I am trying to plot the probability curve for the model. Y=1 = 1 / 1+e^-z where z=B0 + B1X1 + ... + BnXi If the model had only one predictor, I know to do something like below. mod1 = glm(factor(won) ~ as.numeric(bid), data=mydat, family=binomial(link=logit)) all.x - expand.grid(won=unique(won), bid=unique(bid)) y.hat.new - predict(mod1, newdata=all.x, type=response) plot(bid-000:250,predict(mod1,newdata=data.frame(bid-c(000:250)),type=response), lwd=5, col=blue, type=l) I'm not sure how to proceed when I have 10 or so predictors in the logit model. Do I simply expand the expand.grid() function to include all the variables? So my question is how do I form a plot of a logit probability curve when I have 10 predictors? would be nice to do this in ggplot2. Thanks! -- *Abraham Mathew Statistical Analyst www.amathew.com 720-648-0108 @abmathewks* [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.