Re: [R] SLR 500 Random Samples
It seems that this is homework! Please read the Posting Guide carefully: http://www.r-project.org/posting-guide.html -- the R-help mailing list is not intended for "basic statistics and classroom homework"! Best, Dimitris

Schwonka wrote: Consider the SLR y = 50 + 10x + e, where e is NID(0, 16) and n = 20 pairs of observations. Generate 500 samples of 20 observations, drawing one observation from each level of x = 0.5, 1, 1.5, ..., 10 for each sample. A) For each sample compute the least squares regression estimates of the slope and intercept. Construct histograms of the sample values of B_0_hat and B_1_hat. B) For each sample compute an estimate of E(y|x=5). Construct a histogram from these estimates. C) For each sample compute a 95% CI on the slope. How many of these contain the true value B_1 = 10? D) For each estimate of E(y|x=5) in part B compute the 95% CI. How many of these intervals contain the true value E(y|x=5) = 100? I have no idea how to do these, as R was not a prerequisite for my regression course but is needed for these problems. As much help as you could offer would be amazing.

-- Dimitris Rizopoulos, Assistant Professor, Department of Biostatistics, Erasmus University Medical Center. Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands. Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SLR 500 Random Samples
Consider the SLR y = 50 + 10x + e, where e is NID(0, 16) and n = 20 pairs of observations. Generate 500 samples of 20 observations, drawing one observation from each level of x = 0.5, 1, 1.5, ..., 10 for each sample. A) For each sample compute the least squares regression estimates of the slope and intercept. Construct histograms of the sample values of B_0_hat and B_1_hat. B) For each sample compute an estimate of E(y|x=5). Construct a histogram from these estimates. C) For each sample compute a 95% CI on the slope. How many of these contain the true value B_1 = 10? D) For each estimate of E(y|x=5) in part B compute the 95% CI. How many of these intervals contain the true value E(y|x=5) = 100? I have no idea how to do these, as R was not a prerequisite for my regression course but is needed for these problems. As much help as you could offer would be amazing. -- View this message in context: http://n4.nabble.com/SLR-500-Random-Samples-tp1555769p1555769.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Creating a new Access database with R
I have just got an idea: using RExcel and coding with Excel VBA can cope with it -- it works! Ryusuke

From: urishim...@optiver.com To: ryusukeke...@hotmail.com Date: Mon, 15 Feb 2010 08:31:49 +0100 Subject: RE: RE: Creating a new Access database with R
No idea, sorry. Haven't checked it since. You might want to look at simply creating a new file titled 'justanothername.mdb' and see if it works (probably not). Sorry I can't be more helpful, which is why I keep this off the list... Uri

From: Ryusuke Kenji [mailto:ryusukeke...@hotmail.com] Sent: Thursday 11 February 2010 16:51 To: Uri Shimron; r-help@r-project.org Subject: RE: Creating a new Access database with R
I am facing the same problem as well. I would like to code with the following concept but am wondering how to do it:

if (*mdb file exists) { add new row/col } else { add new *mdb file }

-- From: Uri Shimron Date: Thu 21 Sep 2006 - 13:27:35 GMT
First of all, since this is my first posting, I would like to thank anybody who [[elided Hotmail spam]] My question is: how do I create a new Access database with R? I need a channel before I can do anything, but if the mdb-file doesn't exist, I can't connect to it with odbcConnectAccess. I've looked at the RODBC.pdf on CRAN, searched the mailing lists, and looked at the test.R file in the package. But probably I've overlooked something. It is of course possible to keep a clean new mdb-file somewhere and then copy it to the required directory with: shell("copy EmptyDB.mdb NewLocation.mdb") But that isn't very elegant... Thanks in advance, Uri Shimron

*** This email and any files transmitted with it are confide...{{dropped:23}}
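A minimal sketch of the fallback the thread converges on (file names are illustrative, not from the thread): test whether the .mdb exists, copy a blank template only when it is missing, then open the channel with RODBC.

```r
# Sketch, assuming RODBC is installed and a blank template .mdb is kept
# somewhere. RODBC itself cannot create an Access database from scratch.
library(RODBC)

mdb <- "NewLocation.mdb"              # illustrative target name
if (!file.exists(mdb)) {
  file.copy("EmptyDB.mdb", mdb)       # copy the empty template into place
}

conn <- odbcConnectAccess(mdb)        # the channel can now be opened
# ... sqlSave() / sqlQuery() as needed ...
close(conn)
```

Not elegant either, but it avoids shelling out to `copy` and works the same on any path R can see.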
Re: [R] lm function in R
Dear Something, Ah, you need accessor functions. Assign your lm() call to a variable name. If my data frame or object is called dat, for example, I usually use dat.lm. summary(dat.lm), resid(dat.lm). ?summary.lm gives a host of information, including summary(dat.lm)$sigma and $df for resolving SSE. Alternatively, sum(resid(dat.lm)^2). See anova() as well. Also, vif() [car] for the variance inflation factor (e.g. tolerance = 1/vif). Hope this helps. Sincerely, KeithC.

From: Something Something [mailto:mailinglist...@gmail.com] Sent: Sunday, February 14, 2010 11:49 PM To: kMan Subject: Re: [R] lm function in R
Thank you so much, kMan. That makes sense. Only one question: how can I see the value of 'error'? Here's what I see:

Call:
lm(formula = Y ~ X1 * X2 * X3, na.action = na.exclude)
Coefficients:
(Intercept)       X1       X2       X3    X1:X2     X1:X3    X2:X3  X1:X2:X3
   177.9245   0.2005   2.4482   3.1216   0.8127  -26.6166  -3.0398   29.6049

I don't see 'error' here. Please explain. Thanks again for your help.

On Sun, Feb 14, 2010 at 3:26 PM, kMan wrote: Dear Something, It is tricky to differentiate subscripts vs. different variables in plain text, so I'll just use X and Z. Y~X+Z yields the model Y = b0 + b1*X + b2*Z + error. Y~X*Z yields the model Y = b0 + b1*X + b2*Z + b3*X*Z + error, where b3 is the interaction coefficient. Y~X+Z+I(X*Z) is the same thing but with the interaction term spelled out. See ?I. XZint<-X*Z; Y~X+Z+XZint is the same thing but avoids the call to I() in the model. Note that Y~a*a or Y~a^2 does not evaluate an interaction, hence the use of I(). Sincerely, KeithC. -Original Message- From: Something Something [mailto:mailinglist...@gmail.com] Sent: Saturday, February 13, 2010 2:24 PM To: Daniel Nordlund Cc: r-help@r-project.org Subject: Re: [R] lm function in R Thanks Dan. Yes that was very helpful. I didn't see the change from '*' to '+'. It seems that when I put * it means an interaction, and when I put + it's not an interaction. Is it correct to assume then that... 
When I put +, R fits the additive model: Y-hat = b0 + b1*X1 + b2*X2 + ... + bk*Xk. But when I put *, R fits the model with all interaction terms as well: Y-hat = b0 + b1*X1 + b2*X2 + ... + bk*Xk + b12*X1*X2 + b13*X1*X3 + ... up to the full k-way interaction. Is this correct? If it is, then can someone point me to any sources that explain how the coefficients (such as b0 ... bk, b12 ..., b123 ...) are calculated? I guess one source is the R source code :) but is there any other documentation anywhere? Please let me know. Thanks. On Fri, Feb 12, 2010 at 5:54 PM, Daniel Nordlund wrote: > > -Original Message- > > From: r-help-boun...@r-project.org > > [mailto:r-help-boun...@r-project.org] > > On Behalf Of Something Something > > Sent: Friday, February 12, 2010 5:28 PM > > To: Phil Spector; r-help@r-project.org > > Subject: Re: [R] lm function in R > > > > Thanks for the replies everyone. Greatly appreciate it. Some > > progress, but now I am getting the following values when I don't use > > "as.factor" > > > > 13.14167 25.11667 28.34167 49.14167 40.39167 66.86667 > > > > Is that what you guys get? > > > If you look at Phil's response below, no, that is not what he got. > The difference is that you are specifying an interaction, whereas Phil > did not (because the equation you initially specified did not include > an interaction). Use Y ~ X1 + X2 instead of Y ~ X1*X2 for your formula. > > > On Fri, Feb 12, 2010 at 5:00 PM, Phil Spector > > wrote: > > > By converting the two variables to factors, you are fitting an > > > entirely different model. Leave out the as.factor stuff and it > > > will work exactly as you want it to. 
> > > > dat
> > >    V1 V2 V3 V4
> > > 1  s1 14  4  1
> > > 2  s2 23  4  2
> > > 3  s3 30  7  2
> > > 4  s4 50  7  4
> > > 5  s5 39 10  3
> > > 6  s6 67 10  6
> > >
> > > > names(dat) = c('id','y','x1','x2')
> > > > z = lm(y~x1+x2,dat)
> > > > predict(z)
> > >        1        2        3        4        5        6
> > > 15.16667 24.66667 27.66667 46.66667 40.16667 68.66667
> > >
> > > - Phil Spector, Statistical Computing Facility, Department of Statistics, UC Berkeley, spec...@stat.berkeley.edu
> > >
> > > On Fri, 12 Feb 2010, Something Something wrote:
> > >> Hello, I am trying to learn how to perform Multiple Regression Analysis in R. I decided to take a simple example given in this PDF: http://www.utdallas.edu/~herve/abdi-prc-pretty.pdf I created a small CSV called students.csv that contains the following data:
> > >> s1 14 4 1
> > >> s2 23 4 2
> > >> s3 30 7 2
> > >> s4 50 7 4
> > >> s5 39 10 3
> > >> s6
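Pulling the accessor-function advice from earlier in this thread into one self-contained sketch (using the six observations posted above):

```r
# Rebuild the posted data and fit the additive model
dat <- data.frame(id = paste0("s", 1:6),
                  y  = c(14, 23, 30, 50, 39, 67),
                  x1 = c(4, 4, 7, 7, 10, 10),
                  x2 = c(1, 2, 2, 4, 3, 6))

dat.lm <- lm(y ~ x1 + x2, data = dat)   # store the fit in a variable

coef(dat.lm)            # b0, b1, b2
resid(dat.lm)           # the 'error': observed minus fitted values
sum(resid(dat.lm)^2)    # SSE
summary(dat.lm)$sigma   # residual standard error
anova(dat.lm)           # sums-of-squares table
```

Storing the fit (rather than printing lm() directly) is what makes all of the extractor functions available.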
Re: [R] lattice/ylim: how to fix ylim[1], but have ylim[2] dynamically calculated?
Rolf Turner wrote:
> On 15/02/2010, at 9:40 AM, Johannes Graumann wrote (in response to some advice from David Winsemius):
>> I am quite certain that this is the most elaborately worded version of "RTFM" I have ever come across.
> I nominate this as a fortune. (Despite Prof. Winsemius's later protestation that his advice was *not* a version of RTFM.)

Uh, oh ... certified notoriety? Joh
Re: [R] Plot different regression models on one graph
Dear Peter, Ah, I see your point, Professor. The point at x=23.5 is carrying the model. Allow me to clarify. I was making a similar point. I was suggesting that the cube term could be left in the model, as you did, but rather than dropping the data point, another model is fit with an additional parameter added to control for the suspected outlier -- essentially a point discontinuity. Note that the call to poly() is not necessary for the graphical representation, so I'll continue the example without it.

fm3 <- lm(y ~ x + I(x^2) + I(x^3), data = dat)
fm3.c <- coef(fm3)

# drop-the-data-point alternative, subsetting with logical indexes
index <- dat$x != 23.5
x2 <- dat$x[index]
y2 <- dat$y[index]
fm4 <- lm(y2 ~ x2 + I(x2^2) + I(x2^3))
fm4.c <- coef(fm4)

# controlling for the potential outlier in the model, without throwing out the data
ctrl <- rep(0, length = dim(dat)[1])
ctrl[dat$x == 23.5] <- resid(fm3)[dat$x == 23.5]
fm5 <- lm(y ~ x + I(x^2) + I(x^3) + ctrl, data = dat)
fm5.c <- coef(fm5)

# avoids the predict & lines calls, but requires knowledge of the model
curve(fm3.c[1]+fm3.c[2]*x+fm3.c[3]*x^2+fm3.c[4]*x^3, col="green",  add=TRUE)  # full data
curve(fm4.c[1]+fm4.c[2]*x+fm4.c[3]*x^2+fm4.c[4]*x^3, col="blue",   add=TRUE)  # dropped outlier
curve(fm5.c[1]+fm5.c[2]*x+fm5.c[3]*x^2+fm5.c[4]*x^3, col="purple", add=TRUE)  # with ctrl variable

anova(fm5)

anova(fm5) tells us how much difference one point made, sacrificing a df just to ensure we're not throwing out information haphazardly; use F > 1, or whatever cutoff you want. If it turns out that the point is important, then at least its existence and effect were documented. There is some irony, I suppose, in that the graphical representation of dropping the point vs. adding a control variable is equivalent for this example. I have not decided how I feel about that yet, but I do have a splitting headache. Sincerely, KeithC. 
-Original Message- From: Peter Ehlers [mailto:ehl...@ucalgary.ca] Sent: Sunday, February 14, 2010 10:04 PM To: kMan Cc: r-help@r-project.org Subject: Re: [R] Plot different regression models on one graph

kMan wrote:
> Peter wrote:
>> You like to live dangerously.
>
> Clue me in, Professor.
>
> Sincerely,
> KeithC.

Okay, Keith, here goes:

dat <- structure(list(x = c(62.5, 68.5, 0, 52, 0, 52, 0, 52, 23.5, 86,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0), y = c(0.054, 0.055, 0.017, 0.021,
  0.02, 0.028, 0.032, 0.073, 0.076, 0.087, 0.042, 0.042, 0.041, 0.045,
  0.021, 0.018, 0.017, 0.018, 0.028, 0.022)), .Names = c("x", "y"),
  row.names = c(NA, -20L), class = "data.frame")

fm1 <- lm(y ~ poly(x,3), data = dat)
fm2 <- lm(y ~ poly(x,3), data = dat, subset = -9)
xx <- 0:86
yhat1 <- predict(fm1, list(x = xx))
yhat2 <- predict(fm2, list(x = xx))
plot(y ~ x, data = dat)
lines(xx, yhat1, col="blue", lwd=2)
lines(xx, yhat2, col="red", lwd=2)

That's how much difference *one* point makes in a cubic fit! I'm not much of a gambler, so I wouldn't base any important decisions on the results of the fit. Cheers, -Peter Ehlers

> -Original Message- > From: Peter Ehlers [mailto:ehl...@ucalgary.ca] > Sent: Sunday, February 14, 2010 6:49 PM > To: kMan > Cc: 'David Winsemius'; 'Rhonda Reidy'; r-help@r-project.org > Subject: Re: [R] Plot different regression models on one graph > > kMan wrote: >> I would use all of the data. If you want to "drop" one, control for >> it in the model & sacrifice a degree of freedom. > > You like to live dangerously. > > -Peter Ehlers > >> Why the call to poly() by the way? >> >> KeithC. >> >> -Original Message- >> From: Peter Ehlers [mailto:ehl...@ucalgary.ca] >> Sent: Saturday, February 13, 2010 1:35 PM >> To: David Winsemius >> Cc: Rhonda Reidy; r-help@r-project.org >> Subject: Re: [R] Plot different regression models on one graph >> >> Rhonda: >> >> As David points out, a cubic fit is rather speculative. 
I think that >> one needs to distinguish two situations: 1) theoretical justification >> for a cubic model is available, or 2) you're dredging the data for the "best" fit. >> Your case is the second. That puts you in the realm of EDA (exploratory data >> analysis). You're free to fit any model you wish, but you should assess the >> value of the model. One convenient way is to use the >> influence.measures() function, which will show that points 9 and 10 >> in your data have a strong influence on your cubic fit (as, of >> course, your eyes would tell you). A good thing to do at this point is to fit 3 more cubic models: >> 1) omitting point 9, 2) omitting point 10, 3) omitting both. >> >> Try it and plot the resulting fits. You may be surprised. >> >> Conclusion: Without more data, you can conclude nothing about a >> non-straightline fit. >> >> (And this hasn't even addressed the relative abundance of x=0 data.) >> >> -Peter Ehlers >> >> David Winsemius wrote: >>> On Feb 13, 2010, at 1:35 PM, Rhonda Reidy wrote: >>> The following variables have the following significant relationships (x is the explanatory variable): linear, cubic, exponential, logistic.
Re: [R] Separating columns, and sorting by rows
On Feb 15, 2010, at 1:22 AM, milton ruser wrote:

Hi Raging Jim, maybe this is a starting point.

myDF <- read.table(stdin(), head=T, sep=",")

Those "mm" entries will become factors, which can lead to confusion for newbies. It might be more straightforward to always use stringsAsFactors=FALSE in the read.table arguments. I see that the yyyy-mm column later gets thrown away, so it may not matter here.

mm,Rainfall
1977-02,17.4
1977-03,34.0
1977-04,26.2
1977-05,42.6
1977-06,58.6
1977-07,23.2
1977-08,26.8
1977-09,48.4
1977-10,47.0
1977-11,37.2
1977-12,15.0
1978-01,2.6
1978-02,6.8
1978-03,9.0
1978-04,46.6

When I did a very similar maneuver, I added an extra NA entry at the beginning:

myDF <- rbind(list(mm="1977-01", Rainfall=NA), myDF)

... so the columns would start with January. (The warning is harmless.)

myDF$yyyy <- substr(myDF$mm, 1, 4)
myDF$mm <- substr(myDF$mm, 6, 7)
myDF <- subset(myDF, select=c(yyyy, mm, Rainfall))
myDF.reshape <- reshape(myDF, v.names="Rainfall", idvar="yyyy", timevar="mm", direction="wide")
myDF.reshape

best regards

When the time comes to rename those columns, knowing that there is a system constant called month.name may come in handy. Perhaps (untested):

names(myDF.reshape) <- c("Year", month.name)

milton

-- David Winsemius, MD, Heritage Laboratories, West Hartford, CT
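Putting the steps of this thread together into one self-contained sketch (the column name `yyyy`, the column reordering, and the month.abb renaming are my additions; the data are the values posted above):

```r
# Rebuild the posted data and pivot it to one row per year.
myDF <- data.frame(
  mm = c("1977-02","1977-03","1977-04","1977-05","1977-06","1977-07",
         "1977-08","1977-09","1977-10","1977-11","1977-12",
         "1978-01","1978-02","1978-03","1978-04"),
  Rainfall = c(17.4, 34.0, 26.2, 42.6, 58.6, 23.2, 26.8, 48.4, 47.0,
               37.2, 15.0, 2.6, 6.8, 9.0, 46.6),
  stringsAsFactors = FALSE)

myDF$yyyy <- substr(myDF$mm, 1, 4)   # year part of "yyyy-mm"
myDF$mm   <- substr(myDF$mm, 6, 7)   # month part

# One row per year, one column per month; months absent in a year become NA.
wide <- reshape(myDF, v.names = "Rainfall", idvar = "yyyy",
                timevar = "mm", direction = "wide")

# Reorder the Rainfall.* columns into calendar order, then rename.
mos  <- as.integer(sub("Rainfall\\.", "", names(wide)[-1]))
wide <- wide[, c(1, order(mos) + 1)]
names(wide) <- c("Year", month.abb[sort(mos)])
wide
```

Because the month codes are taken from the data, this adapts automatically when a different csv file or SQL query is loaded.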
[R] Exponential Fitting to Credit default data - A theoretical query
Dear R helpers, I am working on credit risk default data and am referring to "An introduction to Credit Risk Modelling" by Christian Bluhm. The literature affirms that 'the default frequencies grow exponentially with decreasing credit worthiness'. I have data of rating-wise default frequencies. The idea is to fit an exponential of the type Y = a * exp(b*X), where Y is the (dependent variable) default frequency and X is the rating class code. So e.g. suppose I have rating classes AAA, AA, A and so on, coded as 1, 2, 3 etc., and suppose the observed default frequencies are say 0.00011, 0.0029, 0.0083 respectively; then my values of X and Y are as follows:

X  Y
1  0.0001
2  0.0029
3  0.0083
.. ..  etc

and to this data I am fitting Y = a * exp(b*X). My queries are: (1) is there any R function which will find the estimated values of a and b directly, instead of taking logarithms, converting both sides into a linear equation, and solving for a and b; (2) How do I find out statistically that this is the best fit? E.g. had it been linear regression, the R^2 would give me some idea about the fitted line. In other words, is there any way of comparing the observed values of Y against the estimated values of Y, i.e. the values obtained from the equation Y = a * exp(b*X)? Will the chi-square be a good choice? I was trying the t-test as given below and am not sure whether I am right to do so. For each rating class, I have the observed value of Y and the estimated value. Say for AAA, the observed value is 0.0001 whereas the estimated value is say 0.0017. I have tested the following hypothesis:

Ho : Y(AAA) = 0.0017
H1 : Y(AAA) != 0.0017

So this is a two-tailed case and I have applied the t-test as tcal(AAA) = [Y(obs) - Y(estim)]/(s / sqrt(n)). The problem is that in some cases I have observed a significant difference, i.e. the calculated t value falls in the critical region. Thus the need to find out whether my fit is correct or not. 
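For query (1), nls() fits Y = a * exp(b*X) directly on the original scale; a minimal sketch using the three frequencies posted above (the log-linear fit is used only to obtain starting values, not as the estimate):

```r
X <- 1:3
Y <- c(0.0001, 0.0029, 0.0083)

# Log-linear fit just to initialize a and b
start.fit <- lm(log(Y) ~ X)
a0 <- exp(coef(start.fit)[1])
b0 <- coef(start.fit)[2]

# Direct nonlinear least squares, no log transformation of the model
fit <- nls(Y ~ a * exp(b * X), start = list(a = a0, b = b0))
summary(fit)

# Query (2): compare observed vs. fitted values, e.g. via the residual
# sum of squares or a pseudo-R^2 (there is no exact R^2 analogue for
# nonlinear fits, so treat this only as a rough summary)
1 - sum(resid(fit)^2) / sum((Y - mean(Y))^2)
```

With more rating classes the same call applies unchanged; comparing nested or alternative models via AIC(fit) is another common choice.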
Thanking you in advance. Regards, Julia Cains, Brisbane ** Only a man of Worth sees Worth in other men **
Re: [R] Separating columns, and sorting by rows
Hi Raging Jim, maybe this is a starting point.

myDF <- read.table(stdin(), head=T, sep=",")
mm,Rainfall
1977-02,17.4
1977-03,34.0
1977-04,26.2
1977-05,42.6
1977-06,58.6
1977-07,23.2
1977-08,26.8
1977-09,48.4
1977-10,47.0
1977-11,37.2
1977-12,15.0
1978-01,2.6
1978-02,6.8
1978-03,9.0
1978-04,46.6

myDF$yyyy <- substr(myDF$mm, 1, 4)
myDF$mm <- substr(myDF$mm, 6, 7)
myDF <- subset(myDF, select=c(yyyy, mm, Rainfall))
myDF.reshape <- reshape(myDF, v.names="Rainfall", idvar="yyyy", timevar="mm", direction="wide")
myDF.reshape

best regards, milton
[R] GAM for non-integer proportions
Dear list, I'm using the mgcv package to model the proportion by weight of certain prey in the stomach content of a predator. This proportion is the ratio of two weights (prey weight over stomach weight) and ranges between 0 and 1. The variance is low when the proportion is close to 0 or 1, and higher at intermediate values. It seems that the best way to go is to model this using the "quasi" family with a logit link and a mu(1-mu) variance. Or am I missing something obvious? I will be thankful for any input. All the best, Julian -- Julian Mariano Burgos Hafrannsóknastofnunin/Marine Research Institute Skúlagata 4, 121 Reykjavík, Iceland Sími/Telephone: +354-5752037 Bréfsími/Telefax: +354-5752001 Netfang/Email: jul...@hafro.is, jmbur...@uw.edu
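A minimal sketch of the model described above (the variable names, smooth term, and simulated data are illustrative, not from the post; in mgcv the quasibinomial family gives exactly the logit link with mu(1-mu) variance, and accepts a non-integer response):

```r
library(mgcv)

# Simulated stand-in data: 'prop' plays the role of prey weight over
# stomach weight, a continuous proportion in (0, 1)
set.seed(1)
n <- 200
pred.len <- runif(n, 20, 80)                        # hypothetical covariate
prop <- plogis(-3 + 0.07 * pred.len + rnorm(n, 0, 0.5))

# Quasi-likelihood GAM: logit link, variance proportional to mu(1-mu)
m <- gam(prop ~ s(pred.len), family = quasibinomial(link = "logit"))
summary(m)
plot(m)
```

The dispersion parameter is estimated rather than fixed at 1, which is what accommodates the extra variance at intermediate proportions.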
Re: [R] Separating columns, and sorting by rows
The other alternative would be to edit my SQL query so that the data is brought in from the database and put into the correct format initially.

sqlQuery(conn, "select lsd,ttl_mo_prcp from mo_rains where stn_num=23090")

That is my very basic query. I have also been given this for use in Oracle (I believe): my $query = "select to_char(stn_num,'9') as stn, to_char(lsd,'') as yr, nvl(max(decode(to_char(lsd,'MON'),'JAN',ttl_mo_prcp)),-) jan, nvl(max(decode(to_char(lsd,'MON'),'FEB',ttl_mo_prcp)),-) feb, nvl(max(decode(to_char(lsd,'MON'),'MAR',ttl_mo_prcp)),-) mar, nvl(max(decode(to_char(lsd,'MON'),'APR',ttl_mo_prcp)),-) apr, nvl(max(decode(to_char(lsd,'MON'),'MAY',ttl_mo_prcp)),-) may, nvl(max(decode(to_char(lsd,'MON'),'JUN',ttl_mo_prcp)),-) jun, nvl(max(decode(to_char(lsd,'MON'),'JUL',ttl_mo_prcp)),-) jul, nvl(max(decode(to_char(lsd,'MON'),'AUG',ttl_mo_prcp)),-) aug, nvl(max(decode(to_char(lsd,'MON'),'SEP',ttl_mo_prcp)),-) sep, nvl(max(decode(to_char(lsd,'MON'),'OCT',ttl_mo_prcp)),-) oct, nvl(max(decode(to_char(lsd,'MON'),'NOV',ttl_mo_prcp)),-) nov, nvl(max(decode(to_char(lsd,'MON'),'DEC',ttl_mo_prcp)),-) dec from mo_rains where (stn_num in ($stns)) group by stn_num, to_char(lsd,'') order by to_char(lsd,'') desc;";

But I think that sorts by station number, as it is designed for multiple stations at a time, whereas mine is for one station only. Yet if I plug it into R just to see what happens, I get a plethora of extraordinarily long errors, which I can post if needed. -- View this message in context: http://n4.nabble.com/Separating-columns-and-sorting-by-rows-tp1555806p1555813.html Sent from the R help mailing list archive at Nabble.com.
[R] Separating columns, and sorting by rows
Dear anyone who knows more about R than me (so everyone). I have been bashing my head on the keyboard all day trying to do something with my table. I have some data, like so:

   yyyy-mm  Rainfall(mm)
1  1977-02  17.4
2  1977-03  34.0
3  1977-04  26.2
4  1977-05  42.6
5  1977-06  58.6
6  1977-07  23.2
7  1977-08  26.8
8  1977-09  48.4
9  1977-10  47.0
10 1977-11  37.2
11 1977-12  15.0
12 1978-01   2.6
13 1978-02   6.8
14 1978-03   9.0
15 1978-04  46.6

The data continues on for some number of hundreds of data points. Simply put, I need to make that data.frame into this data.frame/table/matrix/grid/... you get the picture:

       Jan   Feb   Mar   ... etc
Year   Rain  Rain  Rain
Year   Rain  Rain  Rain
Year   Rain  Rain  Rain
...

How on earth do I do it? I have made little to no progress on it all day. Also, just like all the other problems, the year and month will change every time depending upon which csv file or sql query I load into the program. If anyone has any pointers, that would be awesome. Cheers lads. -- View this message in context: http://n4.nabble.com/Separating-columns-and-sorting-by-rows-tp1555806p1555806.html Sent from the R help mailing list archive at Nabble.com.
[R] error message error
Hi r-users, I hope somebody can help me to understand the error message. Here is my code:

## Newton iteration
newton_gam <- function(z) {
  n <- length(z)
  r <- runif(n)
  tol <- 1E-6
  cdf <- vector(length=n, mode="numeric")
  fprime <- vector(length=n, mode="numeric")
  f <- vector(length=n, mode="numeric")
  for (i in 1:1000) {
    cdf <- integrate(fprime, lower = 0, upper = z)$value
    f <- cdf - r
    # Newton method
    z <- z - f/fprime
    if (any(f < tol)) break
  }
  cbind(z, cdf)
}

alp <- 2.0165
bt1 <- 29.107 ; bt2 <- 41.517

> x1 <- d1d4pos[,1]; x1[1:10]
 [1] 28.4 53.6 1.3 29.5 52.1 65.9 72.6 67.6 58.7 34.5
> x2 <- d1d4pos[,2]; x2[1:10]
 [1] 43.5 56.2 0.3 16.6 71.1 86.3 172.8 111.8 89.9 70.2

z <- (x1/bt1)+(x2/bt2); z
newton_gam(min(z))
newton_gam(max(z))
Error in -`*tmp*` : invalid argument to unary operator

> newton_gam(min(z))
       z          cdf
[1,] Inf 1.141004e-05
> newton_gam(max(z))
Error in integrate(fprime, lower = 0, upper = z) : non-finite function value
> z[100]
[1] 7.544834
> newton_gam(z[100])
Error in integrate(fprime, lower = 0, upper = z) : non-finite function value

Thank you.
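A note on the error (my reading, not from the thread): integrate() requires its first argument to be a *function*, but fprime above is a numeric vector of zeros, so both the integral and the Newton step z - f/fprime misbehave. A sketch of what may have been intended, assuming the target is a gamma CDF with shape alp (the density choice is an assumption based on the variable names):

```r
alp <- 2.0165
fprime <- function(t) dgamma(t, shape = alp)  # density = derivative of the CDF

newton_gam <- function(z, tol = 1e-6) {
  r <- runif(length(z))
  for (i in 1:1000) {
    # integrate() over each element of z; first argument must be a function
    cdf <- sapply(z, function(zi) integrate(fprime, lower = 0, upper = zi)$value)
    f <- cdf - r
    # Newton step z - f(z)/f'(z), kept strictly positive for the gamma support
    z <- pmax(z - f / fprime(z), 1e-8)
    if (all(abs(f) < tol)) break
  }
  cbind(z, cdf)
}

newton_gam(c(1, 3, 7))   # solves F(z) = r for uniform random r
```

(For this particular CDF, qgamma(r, shape = alp) would of course give the roots directly; the iteration is useful when no closed-form quantile function exists.)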
Re: [R] Plot different regression models on one graph
kMan wrote:
> Peter wrote:
>> You like to live dangerously.
> Clue me in, Professor.
> Sincerely, KeithC.

Okay, Keith, here goes:

dat <- structure(list(x = c(62.5, 68.5, 0, 52, 0, 52, 0, 52, 23.5, 86,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0), y = c(0.054, 0.055, 0.017, 0.021,
  0.02, 0.028, 0.032, 0.073, 0.076, 0.087, 0.042, 0.042, 0.041, 0.045,
  0.021, 0.018, 0.017, 0.018, 0.028, 0.022)), .Names = c("x", "y"),
  row.names = c(NA, -20L), class = "data.frame")

fm1 <- lm(y ~ poly(x,3), data = dat)
fm2 <- lm(y ~ poly(x,3), data = dat, subset = -9)
xx <- 0:86
yhat1 <- predict(fm1, list(x = xx))
yhat2 <- predict(fm2, list(x = xx))
plot(y ~ x, data = dat)
lines(xx, yhat1, col="blue", lwd=2)
lines(xx, yhat2, col="red", lwd=2)

That's how much difference *one* point makes in a cubic fit! I'm not much of a gambler, so I wouldn't base any important decisions on the results of the fit. Cheers, -Peter Ehlers

-Original Message- From: Peter Ehlers [mailto:ehl...@ucalgary.ca] Sent: Sunday, February 14, 2010 6:49 PM To: kMan Cc: 'David Winsemius'; 'Rhonda Reidy'; r-help@r-project.org Subject: Re: [R] Plot different regression models on one graph kMan wrote: I would use all of the data. If you want to "drop" one, control for it in the model & sacrifice a degree of freedom. You like to live dangerously. -Peter Ehlers Why the call to poly() by the way? KeithC. -Original Message- From: Peter Ehlers [mailto:ehl...@ucalgary.ca] Sent: Saturday, February 13, 2010 1:35 PM To: David Winsemius Cc: Rhonda Reidy; r-help@r-project.org Subject: Re: [R] Plot different regression models on one graph Rhonda: As David points out, a cubic fit is rather speculative. I think that one needs to distinguish two situations: 1) theoretical justification for a cubic model is available, or 2) you're dredging the data for the "best" fit. Your case is the second. That puts you in the realm of EDA (exploratory data analysis). You're free to fit any model you wish, but you should assess the value of the model. 
One convenient way is to use the influence.measures() function, which will show that points 9 and 10 in your data have a strong influence on your cubic fit (as, of course, your eyes would tell you). A good thing to do at this point is to fit 3 more cubic models: 1) omitting point 9, 2) omitting point 10, 3) omitting both. Try it and plot the resulting fits. You may be surprised. Conclusion: Without more data, you can conclude nothing about a non-straightline fit. (And this hasn't even addressed the relative abundance of x=0 data.) -Peter Ehlers

David Winsemius wrote: On Feb 13, 2010, at 1:35 PM, Rhonda Reidy wrote: The following variables have the following significant relationships (x is the explanatory variable): linear, cubic, exponential, logistic. The linear relationship plots without any trouble. Cubic is the 'best' model, but it is not plotting as a smooth curve using the following code: cubic.lm <- lm(y~poly(x,3)) Try: lines(0:80, predict(cubic.lm, data.frame(x=0:80)), lwd=2) But I really must say the superiority of that relationship over a linear one is far from convincing to my eyes. It seems to be caused by one data point. I hope the quotes around "best" mean that you have the same qualms. lines(x, predict(cubic.lm), lwd=2) How do I plot the data and the estimated curves for all of these regression models in the same plot?

x <- c(62.5,68.5,0,52,0,52,0,52,23.5,86,0,0,0,0,0,0,0,0,0,0)
y <- c(0.054,0.055,0.017,0.021,0.020,0.028,0.032,0.073,0.076,0.087,0.042,0.042,0.041,0.045,0.021,0.018,0.017,0.018,0.028,0.022)

Thanks in advance. Rhonda Reidy -- Peter Ehlers University of Calgary
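The diagnostic suggested above can be run directly on the data posted in this thread:

```r
# Data as posted by Rhonda Reidy
dat <- data.frame(
  x = c(62.5, 68.5, 0, 52, 0, 52, 0, 52, 23.5, 86,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  y = c(0.054, 0.055, 0.017, 0.021, 0.020, 0.028, 0.032, 0.073, 0.076,
        0.087, 0.042, 0.042, 0.041, 0.045, 0.021, 0.018, 0.017, 0.018,
        0.028, 0.022))

fm1  <- lm(y ~ poly(x, 3), data = dat)
infl <- influence.measures(fm1)
summary(infl)                        # prints only the flagged observations
which(apply(infl$is.inf, 1, any))    # row numbers of influential points
```

Each row of infl$is.inf marks whether that observation exceeds the usual cutoffs for DFBETAS, DFFITS, covariance ratio, Cook's distance, or hat value.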
Re: [R] Difference in Levene's test between R and SPSS
Ravi Kulkarni wrote: Hello, I notice that when I do Levene's test to test equality of variances across levels of a factor, I get different answers in R and SPSS 16. e.g.: For the chickwts data, in R, levene.test(weight, feed) gives F=0.7493, p=0.5896. SPSS 16 gives F=0.987, p=0.432 Why this difference? Which one should I believe? (I would like to believe R :) When in doubt, believe R. Which levene.test() are you using? There are at least three functions by that name (packages car, lawstat, s20x; you should always state which package is being used). All default to what I would consider to be the better test, namely using the median as location measure. lawstat gives you options: try it with location="mean". Then switch to the Fligner-Killeen test (fligner.test() in package 'stats'). The reference cited in ?fligner.test makes for good reading. -Peter Ehlers Ravi
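The comparison is easy to run; a sketch using the car package (in current car the function is spelled leveneTest(); older versions used levene.test()):

```r
library(car)
data(chickwts)

# R-style Levene (Brown-Forsythe): deviations from the group medians
leveneTest(weight ~ feed, data = chickwts, center = median)

# SPSS-style Levene: deviations from the group means
leveneTest(weight ~ feed, data = chickwts, center = mean)

# A robust alternative from base R
fligner.test(weight ~ feed, data = chickwts)
```

The center = mean variant should reproduce the SPSS numbers; the difference between the two calls is exactly the discrepancy described above.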
[R] Using text put into a dialog box
I have written in R this:

require(tcltk)
tt <- tktoplevel()
Name <- tclVar("")
entry.Name <- tkentry(tt, width="20", textvariable=Name)
tkgrid(tklabel(tt, text="Please enter site number."))
tkgrid(entry.Name)
OnOK <- function() {
  NameVal <- tclvalue(Name)
  use.this = NameVal
  tkdestroy(tt)
}
OK.but <- tkbutton(tt, text=" OK ", command=OnOK)
tkbind(entry.Name, "<Return>", OnOK)
tkgrid(OK.but)
tkfocus(tt)

Fairly simple, yet I am trying to figure out how to save the text that the user writes into the text box. This data will then be used for a query into a SQL database. How do I go about using what is written? Cheers -- View this message in context: http://n4.nabble.com/Using-text-put-into-a-dialog-box-tp1555761p1555761.html Sent from the R help mailing list archive at Nabble.com.
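One answer (my suggestion, not from the thread): use.this is created inside OnOK's local environment and vanishes when the function returns. Assigning with <<- keeps the value after the dialog closes; tkwait.window() makes the script block until then. A sketch based on the code above:

```r
require(tcltk)
tt <- tktoplevel()
Name <- tclVar("")
entry.Name <- tkentry(tt, width = "20", textvariable = Name)
tkgrid(tklabel(tt, text = "Please enter site number."))
tkgrid(entry.Name)
OnOK <- function() {
  use.this <<- tclvalue(Name)   # <<- : value survives tkdestroy(tt)
  tkdestroy(tt)
}
OK.but <- tkbutton(tt, text = " OK ", command = OnOK)
tkbind(entry.Name, "<Return>", OnOK)
tkgrid(OK.but)
tkfocus(tt)
tkwait.window(tt)               # block until the dialog is dismissed

# use.this now holds the typed text, e.g. for an SQL string
# (query shape and 'conn' are illustrative, not from the thread):
# sqlQuery(conn, paste0("select lsd, ttl_mo_prcp from mo_rains where stn_num=",
#                       use.this))
```

Pasting raw user input into SQL is fine for interactive use but worth sanitizing in anything shared.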
[R] Difference in Levene's test between R and SPSS
Hello, I notice that when I do Levene's test to test equality of variances across levels of a factor, I get different answers in R and SPSS 16. e.g.: For the chickwts data, in R, levene.test(weight, feed) gives F=0.7493, p=0.5896. SPSS 16 gives F=0.987, p=0.432 Why this difference? Which one should I believe? (I would like to believe R :) Ravi -- View this message in context: http://n4.nabble.com/Difference-in-Levene-s-test-between-R-and-SPSS-tp1555725p1555725.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Shapiro-Wilk for levels of factor
Thanks! Exactly what I wanted. Ravi
[R] optimization problem with linear constraints
Dear R users, I need some advice on how to use R to optimize this function under the following constraints:

f(x1,x2,x3,y1,y2,y3) = gamma(x1+x2-1)/{gamma(x1)*gamma(x2)} * y1^(x2-1) * y2^(x1-1)
  + gamma(x1+x3-1)/{gamma(x1)*gamma(x3)} * y1^(x3-1) * y3^(x1-1)
  + gamma(x2+x3-1)/{gamma(x2)*gamma(x3)} * y2^(x3-1) * y3^(x2-1)

s.t. 0 < x1, 0 < x2, 0 < x3, 1 < x1+x2, 1 < x1+x3, 1 < x2+x3, 0 < y1, 0 < y2, 0 < y3

I've tried "constrOptim", but I got several errors, e.g. "value out of range in 'gammafn'". Is there any other built-in function that can handle these constraints? Any suggestion will be greatly appreciated. Regards, Kathryn Lord
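The constraints above are all linear, so they can be written as ui %*% par >= ci for constrOptim(). A hedged sketch follows: it assumes f is to be *minimized* (negate it to maximize), uses an arbitrary strictly feasible start, and computes each term through lgamma() on the log scale, which is the usual cure for the "value out of range in 'gammafn'" overflow error.

```r
# Sketch: constrOptim() with the linear constraints ui %*% par >= ci.
# par = c(x1, x2, x3, y1, y2, y3); lgamma() avoids gamma() overflow.
f <- function(p) {
  x <- p[1:3]; y <- p[4:6]
  term <- function(i, j) {
    # log of gamma(xi+xj-1)/(gamma(xi)*gamma(xj)) * yi^(xj-1) * yj^(xi-1)
    exp(lgamma(x[i] + x[j] - 1) - lgamma(x[i]) - lgamma(x[j]) +
        (x[j] - 1) * log(y[i]) + (x[i] - 1) * log(y[j]))
  }
  term(1, 2) + term(1, 3) + term(2, 3)
}

ui <- rbind(diag(6),              # x1, x2, x3, y1, y2, y3 > 0
            c(1, 1, 0, 0, 0, 0),  # x1 + x2 > 1
            c(1, 0, 1, 0, 0, 0),  # x1 + x3 > 1
            c(0, 1, 1, 0, 0, 0))  # x2 + x3 > 1
ci <- c(rep(0, 6), rep(1, 3))

start <- c(1, 1, 1, 0.5, 0.5, 0.5)              # strictly feasible: ui %*% start > ci
fit <- constrOptim(start, f, grad = NULL, ui = ui, ci = ci)
fit$par
fit$value
```

The log-barrier in constrOptim() keeps the search strictly inside the feasible region, so the errors from evaluating gamma() at out-of-range arguments should not recur; a remaining caveat is that open constraints (strict inequalities) mean an optimum on the boundary can only be approached, not attained.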
Re: [R] Dimensional reduction package
Thank you. Sorry, my question was rather ambiguous. I meant to ask for dimensionality reduction and/or intrinsic dimensionality estimation methods that can deal with non-linear data, that is, data that do not lie on a hyperplane. ISOMAP is one such method; LLE, autoencoders, GTM, and kernel PCA are other methods used with non-linear data. Maura

-Original Message-
From: Michael Denslow [mailto:michael.dens...@gmail.com]
Sent: Sun 14/02/2010 23.46
To: mau...@alice.it
Cc: r-h...@stat.math.ethz.ch
Subject: Re: [R] Dimensional reduction package

Hi Maura,

> Is there any R package which implements non-linear dimensionality reduction (LLE, ISOMAP, GTM, and so on) and/or intrinsic dimensionality estimation?

I am not exactly sure what is meant by non-linear dimensionality reduction but there is an isomap function in vegan. This is the isomap of Tenenbaum et al. 2000 and De'ath 1999.

library(vegan)
?isomap

De'ath, G. (1999) Extended dissimilarity: a method of robust estimation of ecological distances from high beta diversity data. Plant Ecology 144, 191-199.

Tenenbaum, J.B., de Silva, V. & Langford, J.C. (2000) A global geometric framework for nonlinear dimensionality reduction. Science 290, 2319-2323.

Hope this helps,
Michael

> Thank you,
> Maura

--
Michael Denslow
I.W. Carpenter Jr. Herbarium [BOON]
Department of Biology, Appalachian State University
Boone, North Carolina, U.S.A.
-- AND --
Communications Manager, Southeast Regional Network of Expertise and Collections
sernec.org
36.214177, -81.681480 +/- 3103 meters
Re: [R] Multiple missing values
John wrote: ... Does anyone know, or know documentation that describes, how to declare multiple values in R as missing without coding them as NA? I wish to be able to treat values as missing while still retaining codes that describe the reason for the value being missing.

I would suggest leaving the "missing values" as-is in your data file and recoding them to NA at the top of each analysis script you run. I find that the only place I usually make use of such information is in the initial descriptives, although you may want to selectively recode for different analyses. Jim
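The suggested pattern can be sketched in a few lines. The codes -97/-98/-99 and their labels are purely illustrative stand-ins for whatever "reason" codes the data actually use: keep them in the stored file, tabulate the reasons where needed, and recode to NA at the top of each analysis script.

```r
# Sketch: informative missing-value codes kept in the data, recoded to NA
# for analysis. The codes and labels below are hypothetical examples.
miss.codes <- c(-97, -98, -99)

d <- data.frame(score = c(12, -99, 15, -97, 20))

# Tabulate the reasons for missingness before discarding them:
table(factor(d$score[d$score %in% miss.codes], levels = miss.codes,
             labels = c("refused", "skipped", "not.applicable")))

# ...then recode to NA at the top of the analysis script:
d$score[d$score %in% miss.codes] <- NA
mean(d$score, na.rm = TRUE)
```

This keeps a single authoritative copy of the data while each script decides for itself how the coded reasons are used.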
Re: [R] Plot different regression models on one graph
Peter wrote: >You like to live dangerously. Clue me in, Professor. Sincerely, KeithC. -Original Message- From: Peter Ehlers [mailto:ehl...@ucalgary.ca] Sent: Sunday, February 14, 2010 6:49 PM To: kMan Cc: 'David Winsemius'; 'Rhonda Reidy'; r-help@r-project.org Subject: Re: [R] Plot different regression models on one graph kMan wrote: > I would use all of the data. If you want to "drop" one, control for it in > the model & sacrifice a degree of freedom. You like to live dangerously. -Peter Ehlers > > Why the call to poly() by the way? > > KeithC. > > -Original Message- > From: Peter Ehlers [mailto:ehl...@ucalgary.ca] > Sent: Saturday, February 13, 2010 1:35 PM > To: David Winsemius > Cc: Rhonda Reidy; r-help@r-project.org > Subject: Re: [R] Plot different regression models on one graph > > Rhonda: > > As David points out, a cubic fit is rather speculative. I think that one > needs to distinguish two situations: 1) theoretical justification for a > cubic model is available, or 2) you're dredging the data for the "best" fit. > Your case is the second. That puts you in the realm of EDA (exploratory data > analysis). You're free to fit any model you wish, but you should assess the > value of the model. One convenient way is to use the influence.measures() > function, which will show that points 9 and 10 in your data have a strong > influence on your cubic fit (as, of course, your eyes would tell you). A > good thing to do at this point is to fit 3 more cubic models: > 1) omitting point 9, 2) omitting point 10, 3) omitting both. > > Try it and plot the resulting fits. You may be surprised. > > Conclusion: Without more data, you can conclude nothing about a > non-straightline fit. > > (And this hasn't even addressed the relative abundance of x=0 data.) 
> -Peter Ehlers
>
> David Winsemius wrote:
>> On Feb 13, 2010, at 1:35 PM, Rhonda Reidy wrote:
>>
>>> The following variables have the following significant relationships
>>> (x is the explanatory variable): linear, cubic, exponential, logistic.
>>> The linear relationship plots without any trouble.
>>>
>>> Cubic is the 'best' model, but it is not plotting as a smooth curve
>>> using the following code:
>>>
>>> cubic.lm <- lm(y~poly(x,3))
>>
>> Try:
>>
>> lines(0:80, predict(cubic.lm, data.frame(x=0:80)), lwd=2)
>>
>> But I really must say the superiority of that relationship over a
>> linear one is far from convincing to my eyes. Seems to be caused by
>> one data point. I hope the quotes around "best" mean that you have the
>> same qualms.
>>
>>> lines(x,predict(cubic.lm),lwd=2)
>>>
>>> How do I plot the data and the estimated curves for all of these
>>> regression models in the same plot?
>>>
>>> x <- c(62.5,68.5,0,52,0,52,0,52,23.5,86,0,0,0,0,0,0,0,0,0,0)
>>>
>>> y <- c(0.054,0.055,0.017,0.021,0.020,0.028,0.032,0.073,0.076,0.087,0.042,0.042,0.041,0.045,0.021,0.018,0.017,0.018,0.028,0.022)
>>>
>>> Thanks in advance.
>>>
>>> Rhonda Reidy
>
> --
> Peter Ehlers
> University of Calgary
Re: [R] Plot different regression models on one graph
kMan wrote: I would use all of the data. If you want to "drop" one, control for it in the model & sacrifice a degree of freedom. You like to live dangerously. -Peter Ehlers Why the call to poly() by the way? KeithC. -Original Message- From: Peter Ehlers [mailto:ehl...@ucalgary.ca] Sent: Saturday, February 13, 2010 1:35 PM To: David Winsemius Cc: Rhonda Reidy; r-help@r-project.org Subject: Re: [R] Plot different regression models on one graph Rhonda: As David points out, a cubic fit is rather speculative. I think that one needs to distinguish two situations: 1) theoretical justification for a cubic model is available, or 2) you're dredging the data for the "best" fit. Your case is the second. That puts you in the realm of EDA (exploratory data analysis). You're free to fit any model you wish, but you should assess the value of the model. One convenient way is to use the influence.measures() function, which will show that points 9 and 10 in your data have a strong influence on your cubic fit (as, of course, your eyes would tell you). A good thing to do at this point is to fit 3 more cubic models: 1) omitting point 9, 2) omitting point 10, 3) omitting both. Try it and plot the resulting fits. You may be surprised. Conclusion: Without more data, you can conclude nothing about a non-straightline fit. (And this hasn't even addressed the relative abundance of x=0 data.) -Peter Ehlers David Winsemius wrote: On Feb 13, 2010, at 1:35 PM, Rhonda Reidy wrote: The following variables have the following significant relationships (x is the explanatory variable): linear, cubic, exponential, logistic. The linear relationship plots without any trouble. Cubic is the 'best' model, but it is not plotting as a smooth curve using the following code: cubic.lm<- lm(y~poly(x,3)) Try: lines(0:80, predict(cubic.lm, data.frame(x=0:80)),lwd=2) But I really must say the superiority of that relationship over a linear one is far from convincing to my eyes. Seems to be caused by one data point. 
I hope the quotes around "best" mean that you have the same qualms.

lines(x,predict(cubic.lm),lwd=2)

How do I plot the data and the estimated curves for all of these regression models in the same plot?

x <- c(62.5,68.5,0,52,0,52,0,52,23.5,86,0,0,0,0,0,0,0,0,0,0)

y <- c(0.054,0.055,0.017,0.021,0.020,0.028,0.032,0.073,0.076,0.087,0.042,0.042,0.041,0.045,0.021,0.018,0.017,0.018,0.028,0.022)

Thanks in advance.

Rhonda Reidy

--
Peter Ehlers
University of Calgary
Re: [R] [Requires Classification] Re: Adding a regression line to an xyplot
Thanks, yes that does the trick; sorry, I inadvertently left the "tavg" in my example. I haven't used the panel function before; it looks like it may be useful. Really appreciate your help.

regards
Andrew McFadden MVS BVSc | Incursion Investigator (animals), Investigation and Diagnostic Centre | Biosecurity New Zealand Ministry of Agriculture and Forestry | 66 Ward St, Wallaceville | PO Box 40 742 | Upper Hutt | New Zealand Telephone: 64-4-894 5611 | Facsimile: 64-4-894 4973 | Mobile: 027-733-1791 | Web: www.maf.govt.nz

-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Monday, 15 February 2010 1:02 p.m.
To: Andrew McFadden
Cc: r-help@r-project.org
Subject: [Requires Classification] Re: [R] Adding a regression line to an xyplot

On Feb 14, 2010, at 6:33 PM, Andrew McFadden wrote:

> Hi R users
>
>> I am trying to add a regression line to a graph for "c" for factor 2
>> only. Any suggestions?
>
> library(lattice)
>
> a=(1:10)
> b=c(1:5,16:20)
> c=as.factor(c(rep(1,5),rep(2,5)))
>
> d=data.frame(a,b,c)
>
> xyplot(a~b, pch=c(6,8),data = tavg, groups=d$c,

Seems likely that you want the dataframe defined above to be used as the argument to 'data='. Otherwise xyplot throws an error unless "tavg" is defined elsewhere. If so, then try:

tavg=data.frame(a,b,c)
xyplot(a~b, pch=c(6,8), data = tavg, groups=c,
  panel=function(x,y, groups,...) {
    panel.xyplot(x,y,groups,type=c("p"),...);
    panel.lmline(x[c=="2"],y[c=="2"],groups,...)},
  xlab="a", ylab="b")

> reg.line=lm, smooth=FALSE, type=c("p","g"),xlab="a",ylab="b")
>
>> I would appreciate your help.
>> Kind regards
>>
>> andy
>>
>> Andrew McFadden MVS BVSc
>> Incursion Investigator
>> Investigation & Diagnostic Centres - Wallaceville Biosecurity New
>> Zealand Ministry of Agriculture and Forestry
>>
>> Phone 04 894 5600 Fax 04 894 4973 Mobile 029 894 5611 Postal address:
>> Investigation and Diagnostic Centre - Wallaceville Box 40742 Ward St
>> Upper Hutt

> This email message and any attachment(s) is intended solely for the addressee(s) named above. The information it contains is confidential and may be legally privileged. Unauthorised use of the message, or the information it contains, may be unlawful. If you have received this message by mistake please call the sender immediately on 64 4 8940100 or notify us by return email and erase the original message and attachments. Thank you.
> The Ministry of Agriculture and Forestry accepts no responsibility for changes made to this email or to any attachments after transmission from the office.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] R project
Those sound like basic questions. You should start by reading the Intro to R that can be found here http://cran.r-project.org/doc/manuals/R-intro.html or at www.r-project.org For your specific questions, searching for "histogram" or "bar chart" at http://www.rseek.org would be very helpful. If you have problems beyond that, putting together a specific example as requested by the posting guide will generally get you much better advice than vague queries. Sarah

On Sun, Feb 14, 2010 at 4:05 PM, navid safavi wrote:
> Dear Sir/Madam
> Could you pls help me at these subjects in R-project?
> How could I construct a table for 5 or 10 records of data set?
> How could I construct a bar chart for categorical variable with overlay?
> How could I construct a histogram for a categorical variable?
> pls inform me.
> Thanks in advanced.
> Navid

--
Sarah Goslee
http://www.functionaldiversity.org
[R] error message in endseq
Hi there, I am new to R and feel so bloody stupid. But in spite of a search of several hours I could not find an answer to my problem. I have imported SPSS data into R, and apart from some warnings regarding duplicate labels, everything looks fine and names() lists all my variables. Then I try to run a seqdef command - and get the error message below:

> fam.seq <- seqdef(family_seq, var = c("fseq2_15", "fseq2_16", "fseq2_17", "fseq2_18", " fseq2_19", "fseq2_20", "fseq2_21",
+ "fseq2_22", "fseq2_23", "fseq2_24", "fseq2_25", "fseq2_26", "fseq2_27", "fseq2_28", "fseq2_29", "fseq2_30", "fseq2_31", "fseq2_32", "fseq2_33",
+ "fseq2_34", "fseq2_35", "fseq2_36", "fseq2_37", "fseq2_38", "fseq2_39", "fseq2_40", "fseq2_41", "fseq2_42", "fseq2_43", "fseq2_44", "fseq2_45",
+ "fseq2_46", "fseq2_47", "fseq2_48", "fseq2_49", "fseq2_50"),
+ alphabet = c("1","2","3","4","5","6","7","8"),
+ labels = c("not left - never married - no children", "not left - never married - at least 1 child", "not left - has married - no children", "not left - has married - at least 1 child", "left - never married - no children", "left - never married - at least 1 child", "left - has married - no children", "left - has married - at least 1 child"))

Error in subset.default(data, , var) : element 1 is empty;
   the part of the args list of 'is.logical' being evaluated was: (subset)

I understand that the problem seems to be with the list of variables I use to define what my sequences are. I tried changing lots of things but nothing helped. Have I skipped an important step? Many thanks for your help (I have to sleep now). nomis
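One thing worth noticing in the call above: the fifth name is " fseq2_19" with a leading space, which is exactly the kind of typo that makes the variable lookup come back empty. Building the variable list programmatically avoids the problem entirely, as this sketch shows (`family_seq` is the poster's data frame, so the seqdef() call itself is left commented):

```r
# Sketch: construct the 36 variable names instead of typing them by hand,
# so no stray spaces or typos can creep in.
vars <- paste0("fseq2_", 15:50)   # "fseq2_15" ... "fseq2_50"

# Before calling seqdef(), verify every name really exists in the data:
# stopifnot(all(vars %in% names(family_seq)))
# fam.seq <- seqdef(family_seq, var = vars,
#                   alphabet = as.character(1:8), labels = ...)
```

The stopifnot() check turns a cryptic downstream error inside subset() into an immediate, readable failure at the point of the mistake.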
[R] how to delete a parameter from list after running negative binomial error
Hello everyone, Sorry if my question is not clear; my first language is not English, but Portuguese. I am building a model for my data using negative binomial errors. I am having a bit of a problem when updating the model to remove parameters that do not autocorrelate with other variables (I have used an autocorrelation function to check this). So my first model looks like this:

model1.glm.nb <- glm.nb(ozone ~ ., data=climate.dat)
summary(model1.glm.nb)
model1AIC.glm.nb <- stepAIC(model1.glm.nb)

When I run it, it doesn't give me any significance. So I run a second model, using my autocorrelation table, and remove one more variable that does not autocorrelate with the others (anything below .07):

model2.glm.nb2 <- update(model1.glm.nb, ~ . - rain, data=climate.dat)
summary(model2.glm.nb2)

Now, when I run a 3rd model to remove rain, summary() shows that rain has been removed. But once I build a fourth model, rain is in it again. How can I remove more than one variable in one go? I was thinking of doing something like this:

model2.glm.nb2 <- update(model1.glm.nb, ~ . - rain, - temp, data=climate.dat)
summary(model2.glm.nb2)

Obviously using the - sign this way is not working and I am getting an error message. Thank you for your help. Michelle
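The fix is that the removed terms must all stay inside the one updated formula, joined by further "-" signs; the comma after "- rain" in the attempted call starts a new function argument, hence the error. A sketch on a stand-in lm fit (the same formula idiom works for a glm.nb fit; the data here are simulated placeholders, not the poster's climate.dat):

```r
# Sketch: dropping several terms in a single update() call.
set.seed(1)
d <- data.frame(y = rnorm(20), rain = rnorm(20), temp = rnorm(20),
                wind = rnorm(20))
m1 <- lm(y ~ rain + temp + wind, data = d)

m2 <- update(m1, . ~ . - rain - temp)  # both terms removed in one go
names(coef(m2))                        # only the intercept and wind remain
```

Because update() re-fits from the modified formula, the removal persists in every later update of m2, which resolves the "rain comes back" symptom.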
[R] gnls not optimizing values
Sometimes when I try to fit a model to data using the gnls function, it doesn't return optimized values of the model parameters. Instead it just returns the exact same values I used as initial guesses, but with standard errors calculated. Example as follows:

two_site_mHb <- gnls(y ~ (CS_AH2 + CS_AH1*10^(n1*(x-pKa1)) + CS_A_*10^((n1+1)*x-n1*pKa1-pKa2)) /
    (1 + 10^(n1*(x-pKa1)) + 10^((n1+1)*x-n1*pKa1-pKa2)),
  start=list(CS_AH2=y[1], CS_A_=y[length(y)], CS_AH1=y[length(y)/2], pKa1=3.3, pKa2=5.9, n1=1.5))

and the results are:

Generalized nonlinear least squares fit
  Model: y ~ (CS_AH2 + CS_AH1 * 10^(n1 * (x - pKa1)) + CS_A_ * 10^((n1 + 1) * x - n1 * pKa1 - pKa2))/(1 + 10^(n1 * (x - pKa1)) + 10^((n1 + 1) * x - n1 * pKa1 - pKa2))
  Data: NULL
        AIC       BIC   logLik
  -112.2501 -105.6390 63.12505

Coefficients:
        Value Std.Error   t-value p-value
CS_AH2  8.390 0.0081958 1023.6969       0
CS_A_   8.629 0.0051456 1676.9804       0
CS_AH1  8.663 0.0085524 1012.9321       0
pKa1    3.300 0.0443557   74.3985       0
pKa2    5.900 0.4938263   11.9475       0
n1      1.500 0.2288179    6.5554       0

Correlation:
       CS_AH2  CS_A_ CS_AH1   pKa1   pKa2
CS_A_  -0.044
CS_AH1 -0.288  0.179
pKa1    0.436  0.083  0.413
pKa2    0.179 -0.504 -0.681 -0.295
n1      0.641 -0.095 -0.585  0.065  0.373

Standardized residuals:
        Min          Q1         Med          Q3         Max
-1.85468073 -0.10985684  0.05285202  0.45446735  1.81225021

Residual standard error: 0.01055068
Degrees of freedom: 19 total; 13 residual

Any thoughts as to why it's just spitting back the initial guesses, and how it might be avoided? Thanks. -Brian Doctrow
[R] R project
Dear Sir/Madam, Could you please help me with these subjects in R? How could I construct a table for 5 or 10 records of a data set? How could I construct a bar chart for a categorical variable with overlay? How could I construct a histogram for a categorical variable? Please inform me. Thanks in advance. Navid
Re: [R] Estimated Standard Error for Theta in zeroinfl()
On 15/02/2010, at 12:49 PM, Lam, Tzeng Yih wrote:

> Dear R Users,
>
> When using the zeroinfl() function to fit a Zero-Inflated Negative Binomial (ZINB) model to a dataset, summary() gives an estimate of log(theta) and its standard error, z-value and Pr(>|z|) for the count component. Additionally, it also provides an estimate of Theta, which I believe is exp(estimate of log(theta)).
>
> However, if I would like to have a standard error of Theta itself (not SE.logtheta), how would I obtain or calculate that standard error?

It's tricky. The exp and log functions aren't linear!!! So nothing really works very well.

To start with, if you have an unbiased estimate of log(theta), say lt.hat, then exp(lt.hat) is NOT an unbiased estimate of theta. If you are willing to assume that lt.hat is normally distributed, then the expected value of exp(lt.hat) is theta*exp(sigma^2/2), where sigma^2 is the variance of lt.hat. The variance of exp(lt.hat) is

(*) theta^2 * exp(sigma^2) * (exp(sigma^2) - 1).

You can ``plug in'' the SE of lt.hat for sigma into the foregoing and get an ***``approximately''*** unbiased estimate of theta:

theta.hat = exp(lt.hat - SE^2/2)

and then an approximate estimate of the variance of this ``theta.hat'' (by plugging in theta.hat for theta and SE for sigma in (*)). The results won't be correct, but they'll probably be in the right ball park. I think!

This is all posited on the distribution of the estimate of log(theta) being normal (or ``Gaussian''). Whether this is a justifiable assumption in your setting is questionable. Some simulation experiments might be illuminating.

cheers,
Rolf Turner

P. S. The formulae I gave above are the result of quickly scribbled calculations, and could be wrong. They should be checked. R. T.

##
Attention: This e-mail message is privileged and confidential. If you are not the intended recipient please delete the message and notify the sender. Any views or opinions presented are solely those of the author.
##
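A numerical companion to the lognormal formulas above: the simplest delta-method answer is SE(theta.hat) ≈ theta.hat * SE(log theta.hat), while the exp(SE^2/2) factor gives the approximate bias correction derived in the reply. The numbers below are purely illustrative stand-ins for the summary() output.

```r
# Sketch: delta-method SE for theta, given an estimate of log(theta) and
# its SE. lt.hat and se.lt are hypothetical values, not real model output.
lt.hat <- log(2.5)   # estimate of log(theta)
se.lt  <- 0.30       # SE of log(theta)

theta.delta <- exp(lt.hat)          # plug-in (delta-method) estimate of theta
se.delta    <- theta.delta * se.lt  # delta-method SE: theta * SE(log theta)

# Lognormal-based (approximately) bias-corrected version from the reply:
theta.unb <- exp(lt.hat - se.lt^2 / 2)
var.theta <- theta.unb^2 * exp(se.lt^2) * (exp(se.lt^2) - 1)
se.theta  <- sqrt(var.theta)
```

For small SE(log theta) the two answers nearly coincide, which is a quick internal consistency check on either calculation; for large SE, the normality assumption itself is the weak link, as the reply stresses.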
Re: [R] Adding a regression line to an xyplot
On Feb 14, 2010, at 6:33 PM, Andrew McFadden wrote:

> Hi R users
>
> I am trying to add a regression line to a graph for "c" for factor 2 only. Any suggestions?
>
> library(lattice)
>
> a=(1:10)
> b=c(1:5,16:20)
> c=as.factor(c(rep(1,5),rep(2,5)))
>
> d=data.frame(a,b,c)
>
> xyplot(a~b, pch=c(6,8),data = tavg, groups=d$c,

Seems likely that you want the dataframe defined above to be used as the argument to 'data='. Otherwise xyplot throws an error unless "tavg" is defined elsewhere. If so, then try:

tavg=data.frame(a,b,c)
xyplot(a~b, pch=c(6,8), data = tavg, groups=c,
  panel=function(x,y, groups,...) {
    panel.xyplot(x,y,groups,type=c("p"),...);
    panel.lmline(x[c=="2"],y[c=="2"],groups,...)},
  xlab="a", ylab="b")

> reg.line=lm, smooth=FALSE, type=c("p","g"),xlab="a",ylab="b")
>
> I would appreciate your help.
>
> Kind regards
>
> andy
>
> Andrew McFadden MVS BVSc
> Incursion Investigator
> Investigation & Diagnostic Centres - Wallaceville Biosecurity New Zealand Ministry of Agriculture and Forestry
>
> Phone 04 894 5600 Fax 04 894 4973 Mobile 029 894 5611 Postal address: Investigation and Diagnostic Centre - Wallaceville Box 40742 Ward St Upper Hutt
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
[R] Estimated Standard Error for Theta in zeroinfl()
Dear R Users, When using the zeroinfl() function to fit a Zero-Inflated Negative Binomial (ZINB) model to a dataset, summary() gives an estimate of log(theta) and its standard error, z-value and Pr(>|z|) for the count component. Additionally, it also provides an estimate of Theta, which I believe is exp(estimate of log(theta)). However, if I would like to have a standard error of Theta itself (not SE.logtheta), how would I obtain or calculate that standard error? Thank you very much for your time. Best regards, Tzeng Yih Lam

--
PhD Candidate
Department of Forest Engineering, Resources and Management
College of Forestry, Oregon State University
321 Richardson Hall, Corvallis OR 97330 USA
Phone: +1.541.713.7504 Fax: +1.541.713.7504
[R] Adding a regression line to an xyplot
Hi R users

I am trying to add a regression line to a graph for "c" for factor 2 only. Any suggestions?

library(lattice)

a=(1:10)
b=c(1:5,16:20)
c=as.factor(c(rep(1,5),rep(2,5)))

d=data.frame(a,b,c)

xyplot(a~b, pch=c(6,8),data = tavg, groups=d$c,
  reg.line=lm, smooth=FALSE, type=c("p","g"),xlab="a",ylab="b")

I would appreciate your help.

Kind regards

andy

Andrew McFadden MVS BVSc
Incursion Investigator
Investigation & Diagnostic Centres - Wallaceville Biosecurity New Zealand Ministry of Agriculture and Forestry
Phone 04 894 5600 Fax 04 894 4973 Mobile 029 894 5611 Postal address: Investigation and Diagnostic Centre - Wallaceville Box 40742 Ward St Upper Hutt
Re: [R] Plot different regression models on one graph
Dear Rhonda, Consider curve(). KeithC.

-Original Message-
From: Rhonda Reidy [mailto:rre...@gmail.com]
Sent: Saturday, February 13, 2010 11:36 AM
To: r-help@r-project.org
Subject: [R] Plot different regression models on one graph

The following variables have the following significant relationships (x is the explanatory variable): linear, cubic, exponential, logistic. The linear relationship plots without any trouble.

Cubic is the 'best' model, but it is not plotting as a smooth curve using the following code:

cubic.lm <- lm(y~poly(x,3))
lines(x,predict(cubic.lm),lwd=2)

How do I plot the data and the estimated curves for all of these regression models in the same plot?

x <- c(62.5,68.5,0,52,0,52,0,52,23.5,86,0,0,0,0,0,0,0,0,0,0)

y <- c(0.054,0.055,0.017,0.021,0.020,0.028,0.032,0.073,0.076,0.087,0.042,0.042,0.041,0.045,0.021,0.018,0.017,0.018,0.028,0.022)

Thanks in advance.

Rhonda Reidy
Re: [R] Plot different regression models on one graph
I would use all of the data. If you want to "drop" one, control for it in the model & sacrifice a degree of freedom. Why the call to poly() by the way? KeithC. -Original Message- From: Peter Ehlers [mailto:ehl...@ucalgary.ca] Sent: Saturday, February 13, 2010 1:35 PM To: David Winsemius Cc: Rhonda Reidy; r-help@r-project.org Subject: Re: [R] Plot different regression models on one graph Rhonda: As David points out, a cubic fit is rather speculative. I think that one needs to distinguish two situations: 1) theoretical justification for a cubic model is available, or 2) you're dredging the data for the "best" fit. Your case is the second. That puts you in the realm of EDA (exploratory data analysis). You're free to fit any model you wish, but you should assess the value of the model. One convenient way is to use the influence.measures() function, which will show that points 9 and 10 in your data have a strong influence on your cubic fit (as, of course, your eyes would tell you). A good thing to do at this point is to fit 3 more cubic models: 1) omitting point 9, 2) omitting point 10, 3) omitting both. Try it and plot the resulting fits. You may be surprised. Conclusion: Without more data, you can conclude nothing about a non-straightline fit. (And this hasn't even addressed the relative abundance of x=0 data.) -Peter Ehlers David Winsemius wrote: > > On Feb 13, 2010, at 1:35 PM, Rhonda Reidy wrote: > >> The following variables have the following significant relationships >> (x is the explanatory variable): linear, cubic, exponential, logistic. >> The linear relationship plots without any trouble. >> >> Cubic is the 'best' model, but it is not plotting as a smooth curve >> using the following code: >> >> cubic.lm<- lm(y~poly(x,3)) > > Try: > > lines(0:80, predict(cubic.lm, data.frame(x=0:80)),lwd=2) > > But I really must say the superiority of that relationship over a > linear one is far from convincing to my eyes. Seems to be caused by > one data point. 
I hope the quotes around "best" mean that you have the same qualms.

> lines(x,predict(cubic.lm),lwd=2)
>
> How do I plot the data and the estimated curves for all of these regression models in the same plot?
>
> x <- c(62.5,68.5,0,52,0,52,0,52,23.5,86,0,0,0,0,0,0,0,0,0,0)
>
> y <- c(0.054,0.055,0.017,0.021,0.020,0.028,0.032,0.073,0.076,0.087,0.042,0.042,0.041,0.045,0.021,0.018,0.017,0.018,0.028,0.022)
>
> Thanks in advance.
>
> Rhonda Reidy

--
Peter Ehlers
University of Calgary
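Peter's suggestion can be carried out directly on Rhonda's data: fit the cubic, inspect the influence diagnostics, and refit with the high-influence points removed, plotting both fits over a fine x grid (which also fixes the non-smooth curve, since predict() is evaluated on many points instead of only the observed x values).

```r
# Sketch: cubic fit, influence diagnostics, and a refit without the
# suspected high-influence points (9 and 10 in the thread's discussion).
x <- c(62.5,68.5,0,52,0,52,0,52,23.5,86,0,0,0,0,0,0,0,0,0,0)
y <- c(0.054,0.055,0.017,0.021,0.020,0.028,0.032,0.073,0.076,0.087,
       0.042,0.042,0.041,0.045,0.021,0.018,0.017,0.018,0.028,0.022)

cubic.lm <- lm(y ~ poly(x, 3))
summary(influence.measures(cubic.lm))   # see which points are flagged

plot(x, y)
xx <- seq(0, 86, length.out = 200)      # fine grid => smooth curve
lines(xx, predict(cubic.lm, data.frame(x = xx)), lwd = 2)

cubic.drop <- update(cubic.lm, subset = -c(9, 10))  # refit without 9 and 10
lines(xx, predict(cubic.drop, data.frame(x = xx)), lwd = 2, lty = 2)
```

Comparing the solid and dashed curves makes the point of the reply concrete: if the cubic shape collapses once two points are dropped, the data do not support a conclusion beyond a straight-line fit.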
Re: [R] Dimensional reduction package
Hi Maura, > Is there any R package which implements non-linear dimensionality reduction > (LLE, ISOMAP, GTM, and so on) and/or intrinsic dimensionality estimation ? I am not exactly sure what is meant by non-linear dimensionality reduction but there is an isomap function in vegan. This is the isomap of Tenenbaum et al. 2000 and De'ath 1999. library(vegan) ?isomap De'ath, G. (1999) Extended dissimilarity: a method of robust estimation of ecological distances from high beta diversity data. Plant Ecology 144, 191–199 Tenenbaum, J.B., de Silva, V. & Langford, J.C. (2000) A global geometric framework for nonlinear dimensionality reduction. Science 290, 2319–2323. Hope this helps, Michael > Thank you, > Maura > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Michael Denslow I.W. Carpenter Jr. Herbarium [BOON] Department of Biology Appalachian State University Boone, North Carolina U.S.A. -- AND -- Communications Manager Southeast Regional Network of Expertise and Collections sernec.org 36.214177, -81.681480 +/- 3103 meters __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
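[Editor's note] A minimal sketch of the vegan route, loosely following the package's own example; it assumes vegan is installed, and the BCI data set and the choice k = 3 are purely illustrative:

```r
library(vegan)
data(BCI)                  # example community data shipped with vegan
dis <- vegdist(BCI)        # Bray-Curtis dissimilarities (vegdist default)
ord <- isomap(dis, ndim = 2, k = 3)  # isomap via k-nearest-neighbour graph
head(ord$points)           # the low-dimensional coordinates
plot(ord)
```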
Re: [R] lattice/ylim: how to fix ylim[1], but have ylim[2] dynamically calculated?
On 15/02/2010, at 9:40 AM, Johannes Graumann wrote (In response to some advice from David Winsemius): > I am quite certain that this is the most elaborately worded version of > "RTFM" I have ever come across. I nominate this as a fortune. (Despite Prof. Winsemius's later protestation that his advice was *not* a version of RTFM.) cheers, Rolf Turner ## Attention:\ This e-mail message is privileged and confid...{{dropped:9}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] xyplot, overlay two variables on one plot with group factors
On Feb 14, 2010, at 3:44 PM, Pat Schmitz wrote: All I want to overlay two variables on the same plot following their appropriate grouping. I have attempted to use subscripting in panel with panel.xyplot, but I can't get the grouping to follow into the panel...here is an example... dat<-data.frame( y= log(1:10), y2=10:19, x=1:10, grp = as.factor(1) ) dat2<-data.frame( y= log(10:19), y2= 20:29, x=1:10, grp = as.factor(c(2)) ) dat<-rbind(dat, dat2) # Here I plot both response variables on y axis by the same x xyplot(y2 ~ x, dat, groups=grp, type="l") xyplot(y ~ x, dat, groups=grp, type="l") # I want to overlay both in the same plot to compare xyplot(y~ x, dat, groups = grp, ylim = c(0,40), panel=function(x,y,subscripts, groups,...){ panel.xyplot(x, y, type="l") panel.xyplot(dat$x[subscripts], dat$y2[subscripts],type="l") } ) My reading of the xyplot help page made me think that "subscripts" was logical and not an indexing variable, although the opposite is suggested by panel.superpose's help page. (Just my continued confusion about how to use lattice functions properly.) It sounds as though you are trying to superimpose what would otherwise be separate panels. So I'm never quite sure that when I mess around with panel arguments that I'm doing it right, but this produces what I think you are trying to do: xyplot(y~ x, dat, groups = grp, ylim = c(0,40), panel=function(x,y, groups,...){ panel.xyplot(x, y, groups, type="l", ...) panel.superpose(x, dat$y2, groups, type="l", ...) }) My hypothesis after experimenting with trying to leave out the , ... argument is that it is necessary to accept the subscripts argument, even though you don't actually need to do anything with it. I think that is implied by the fact that "subscripts (with no default) appears before the ellipsis in the arguments list. 
Thanks -- Patrick Schmitz Graduate Student Plant Biology 1206 West Gregory Drive RM 1500 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] xyplot, overlay two variables on one plot with group factors
Ok...I realized that if I don't wish to compare the groupings directly i can plot the groups as separate panels, which is a much more simple solution, but doesn't necessarily give the same effect. xyplot(y+y2 ~ x | grp, dat) On Sun, Feb 14, 2010 at 3:19 PM, Dennis Murphy wrote: > Hi: > > Here's one approach: the idea is to essentially stack the responses in both > data sets, create a factor that identifies the stacked variables, merges > the data > together and then combines 'variable' and 'grp' from the merged data frame > into a single factor for use as the groups = element in xyplot. (Got all > that?) > > library(lattice) > library(reshape) > # use the melt function of reshape to stack the responses and create an > # identifier (variable) for them. The responses themselves are put into a > # variable named 'value'. id.vars are kept separate, the others are stacked > # into a variable identifier called 'variable' and the vector of values > called > # 'values'... > Dat <- melt(dat, id.vars = c('x', 'grp')) > Dat2 <- melt(dat2, id.vars = c('x', 'grp')) > Dat3 <- rbind(Dat, Dat2) # concatenate the two data frames > Dat3$gv <- with(Dat3, paste(variable, grp, sep = "")) # combine factors > > xyplot(value ~ x, data = Dat3, groups = gv, type = "l") > > There are more elegant ways to create the grouping factor, which you may > need if you intend to create a 'nice' legend, but at this level it works. > > HTH, > Dennis > > On Sun, Feb 14, 2010 at 12:44 PM, Pat Schmitz wrote: > >> All >> >> I want to overlay two variables on the same plot following their >> appropriate >> grouping. I have attempted to use subscripting in panel with >> panel.xyplot, >> but I can't get the grouping to follow into the panel...here is an >> example... 
>> >> >> dat<-data.frame( >> y= log(1:10), >> y2=10:19, >> x=1:10, >> grp = as.factor(1) >> ) >> >> dat2<-data.frame( >> y= log(10:19), >> y2= 20:29, >> x=1:10, >> grp = as.factor(c(2)) >> ) >> >> dat<-rbind(dat, dat2) >> >> # Here I plot both response variables on y axis by the same x >> xyplot(y2 ~ x, dat, groups=grp, type="l") >> xyplot(y ~ x, dat, groups=grp, type="l") >> >> # I want to overlay both in the same plot to compare >> xyplot(y~ x, dat, >> groups = grp, >> ylim = c(0,40), >> panel=function(x,y,subscripts, groups,...){ >> panel.xyplot(x, y, type="l") >> panel.xyplot(dat$x[subscripts], dat$y2[subscripts],type="l") >> } >> ) >> >> >> Thanks >> >> -- >> Patrick Schmitz >> Graduate Student >> Plant Biology >> 1206 West Gregory Drive >> RM 1500 >> >>[[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > -- Patrick Schmitz Graduate Student Plant Biology 1206 West Gregory Drive RM 1500 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lattice/ylim: how to fix ylim[1], but have ylim[2] dynamically calculated?
On Feb 14, 2010, at 3:40 PM, Johannes Graumann wrote: David Winsemius wrote: On Feb 14, 2010, at 10:33 AM, Johannes Graumann wrote: Hello, When drawing "barcharts", I find it not helpful if ylim[1] != 0 - bars for a quantity of 0, that do not show a length of 0 are quite non-intuitive. I have tried to study library(lattice) panel.barchart but am unable to figure out where ylim is taken care of and how one might fix ylim[1] to 0 for barcharts ... Can anyone point out how to tackle this? Looking at Sarkar's "Lattice" text in chapter 8 section 3 "Limits and Aspect Ratio", it appears from subsection 1 that the prepanel function can be used to supply xlim and ylim values. From subsection 2 he clarifies that xlim and ylim can also be specified on a per panel basis (and here I am guessing that this would be within a scales argument) when relation="free". At the end of that section he offers two examples using ylim: the first is not plotted but the second uses the prepanel mechanism for Fig 8.1 and that is probably available on the Lattice website. In the same subsection is offered an alternative to specifying an explicit scales$y$limits to be interpreted as ylim values. My hope is that these ideas and references will be of some use in identifying productive places to look for further documentation. I am quite certain that this is the most elaborately worded version of "RTFM" I have ever come across. Well, not really, it was a version of ... here's my best untested wild guess from someone who persistently struggles with trying to get panel arguments right. I shall go and do so. Sincerely, Joh -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] multiple-test correlation
At 07:35 AM 2/14/2010, Manuel Jesús López Rodríguez wrote: Dear all, I am trying to study the correlation between one "independent" variable ("V1") and several others dependent among them ("V2","V3","V4" and "V5"). For doing so, I would like to analyze my data by multiple-test (applying the Bonferroni´s correction or other similar), but I do not find the proper command in R. What I want to do is to calculate Kendall´s correlation between "V1" and the others variables (i.e. "V1" vs "V2", "V1" vs "V3", etc.) and to correct the p values by Bonferroni or other. I have found "outlier.test", but I do not know if this is what I need (also, I would prefer to use a less conservative method than Bonferroni´s, if possible). Thank you very much in advance! One approach might be to first test for any correlations via a likelihood ratio test: Ho: P = I (no correlations) or covariances are diagonal T = -a ln V ~ chi-square [p(p-1)/2] where V = det(R) a = N -1 - (2 p +5)/6 N = # data p = # variables Reject Ho if T > X^2 (alpha, p(p-1)/2) Then do the pairwise tests without familywise error control. I.e., this is similar to doing the F test in ANOVA before doing LSD testing. Robert A. LaBudde, PhD, PAS, Dpl. ACAFS e-mail: r...@lcfltd.com Least Cost Formulations, Ltd.URL: http://lcfltd.com/ 824 Timberlake Drive Tel: 757-467-0954 Virginia Beach, VA 23464-3239Fax: 757-467-2947 "Vere scire est per causas scire" __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
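[Editor's note] The likelihood-ratio test described above can be sketched in a few lines of R. This is an untested illustration of the formulas exactly as given in the reply; the function name is made up:

```r
# T = -a * ln(det(R)),  a = N - 1 - (2p + 5)/6,  df = p(p - 1)/2
any.correlation.test <- function(X) {
  N <- nrow(X); p <- ncol(X)
  R <- cor(X)                          # Pearson correlations of the columns
  a <- N - 1 - (2 * p + 5) / 6
  stat <- -a * log(det(R))
  df <- p * (p - 1) / 2
  c(statistic = stat, df = df,
    p.value = pchisq(stat, df, lower.tail = FALSE))
}

set.seed(1)
X <- matrix(rnorm(100 * 5), 100, 5)    # 5 uncorrelated variables, N = 100
any.correlation.test(X)
```

If the poster still wants familywise control on the pairwise Kendall tests afterwards, `p.adjust(pvals, method = "holm")` on the `cor.test(..., method = "kendall")` p-values is a standard, less conservative alternative to Bonferroni.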
Re: [R] lattice/ylim: how to fix ylim[1], but have ylim[2] dynamically calculated?
Deepayan Sarkar wrote: > On Sun, Feb 14, 2010 at 7:33 AM, Johannes Graumann > wrote: >> Hello, >> >> When drawing "barcharts", I find it not helpful if ylim[1] != 0 - bars >> for a quantity of 0, that do not show a length of 0 are quite >> non-intuitive. >> >> I have tried to study >> > library(lattice) >> > panel.barchart >> but am unable to figure out where ylim is taken care of and how one might >> fix ylim[1] to 0 for barcharts ... >> >> Can anyone point out how to tackle this? > > Are you sure you are not looking for 'origin=0' (described in > ?panel.barchart)? I sure am - thank you! Following the same path for "bwplot" I found the embarrassingly simple answer to my earlier question regarding "varwidth" in ... Sincerely, Joh __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] xyplot, overlay two variables on one plot with group factors
All I want to overlay two variables on the same plot following their appropriate grouping. I have attempted to use subscripting in panel with panel.xyplot, but I can't get the grouping to follow into the panel...here is an example... dat<-data.frame( y= log(1:10), y2=10:19, x=1:10, grp = as.factor(1) ) dat2<-data.frame( y= log(10:19), y2= 20:29, x=1:10, grp = as.factor(c(2)) ) dat<-rbind(dat, dat2) # Here I plot both response variables on y axis by the same x xyplot(y2 ~ x, dat, groups=grp, type="l") xyplot(y ~ x, dat, groups=grp, type="l") # I want to overlay both in the same plot to compare xyplot(y~ x, dat, groups = grp, ylim = c(0,40), panel=function(x,y,subscripts, groups,...){ panel.xyplot(x, y, type="l") panel.xyplot(dat$x[subscripts], dat$y2[subscripts],type="l") } ) Thanks -- Patrick Schmitz Graduate Student Plant Biology 1206 West Gregory Drive RM 1500 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] sas.get and Windows: WAS RE: SAS and RODBC
> -Original Message- > From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] > Sent: Sunday, February 14, 2010 7:09 AM > To: Daniel Nordlund > Cc: r-help@r-project.org > Subject: Re: [R] SAS and RODBC > > > > On 14.02.2010 08:19, Daniel Nordlund wrote: > >> -Original Message- > >> From: Frank E Harrell Jr [mailto:f.harr...@vanderbilt.edu] > >> Sent: Saturday, February 13, 2010 5:49 AM > >> To: Daniel Nordlund > >> Cc: r-help@r-project.org > >> Subject: Re: [R] SAS and RODBC > >> <<>> > >> Daniel since you have SAS BASE installed why not use sas.get in the > >> Hmisc package and also get access to metadata such as variable labels > >> that ODBC does not handle? Besides providing better documentation, the > >> labels are very useful as axis labels in plotting, etc. > >> > >> Frank > >> > > > > Frank, > > > > I have used sas.get from Hmisc before, and I will continue to use it. I > appreciate the work that you and your colleagues have done with Hmisc and > the Design and rms packages. However, the sas.get function still appears > to be broken on Windows platforms (or Windows is broken :-). I know how > to fix the problem, but I am always looking for approaches where I don't > have to fix things. It may well be that the better documentation > provided by sas.get will prove to out weigh the inconvenience of having to > source an edited version of sas.get for my regular use. > > Daniel, > > if the function is broken and needs to be fixed, why don't you report > your findings (both the error with reproducible code) as well as your > bugfixes to the package maintainer? > > Best wishes, > Uwe > > > <<>> Uwe, The solution has been posted on R-help more than once, but you are right, I should have specifically emailed the maintainer of the package. So, I just did. Thanks for the prompt. 
Dan Daniel Nordlund Bothell, WA USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lattice/ylim: how to fix ylim[1], but have ylim[2] dynamically calculated?
David Winsemius wrote: > > On Feb 14, 2010, at 10:33 AM, Johannes Graumann wrote: > >> Hello, >> >> When drawing "barcharts", I find it not helpful if ylim[1] != 0 - >> bars for a >> quantity of 0, that do not show a length of 0 are quite non-intuitive. >> >> I have tried to study >> > library(lattice) >> > panel.barchart >> but am unable to figure out where ylim is taken care of and how one >> might >> fix ylim[1] to 0 for barcharts ... >> >> Can anyone point out how to tackle this? > > Looking at Sarkar's "Lattice" text in chapter 8 section 3 "Limits and > Aspect Ratio", it appears from subsection 1 that the prepanel function > can be used to supply xlim and ylim values. From subsection 2 > he clarifies that xlim and ylim can also be specified on a per panel > basis (and here I am guessing that this would be within a scales > argument) when relation="free". At the end of that section he offers > two examples using ylim: the first is not plotted but the second uses > the prepanel mechanism for Fig 8.1 and that is probably available on > the Lattice website. > > In the same subsection is offered an alternative to specifying an > explicit scales$y$limits to be interpreted as ylim values. > > My hope is that these ideas and references will be of some use in > identifying productive places to look for further documentation. I am quite certain that this is the most elaborately worded version of "RTFM" I have ever come across. I shall go and do so. Sincerely, Joh __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with specifying variance-covariance matrix for random effects (nlme package)
Hi all, I've been struggling with trying to specify a diagonal matrix for a linear mixed-effects model. I think I've got nearly everything correct, except the following message appears: In lme.formula(fixed = fwave ~ sex + sexXbulbar + visit + age + : Fewer observations than random effects in all level 1 groups Not sure if I've provided enough details, but I'm basically trying to perform a mixed effects model analysis, controlling for several covariates and incorporating random effects for the intercept and slope. What does the second line, 'Fewer observations than random effects in all level 1 groups', mean? Many thanks in advance, Ben Cheah __ [[elided Yahoo spam]] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] unexpected results with higher-order functions and lapply
Hi, I believe it's lazy evaluation. See ?force HTH, baptiste On 14 February 2010 20:32, Jyotirmoy Bhattacharya wrote: > I want to use lapply and a function returning a function in order to build a > list of functions. > >> genr1 <- function(k) {function() {k}} >> l1 <- lapply(1:2,genr1) >> l1[[1]]() > [1] 2 > > This was unexpected. I had expected the answer to be 1, since that is the > value k should be bound to when genr1 is applied to the first element of > 1:2. > > By itself genr1 seems to work fine. >> genr1(5)() > [1] 5 > > I defined a slightly different higher-order function: >> genr2 <- function(k) {k;function() {k}} >> l2 <- lapply(1:2,genr2) >> l2[[1]]() > [1] 1 > > This gives the answer I expected. > > Now I am confused. The function returned by genr2 is exactly the same > function that was being returned by genr1. Why should evaluating k make a > difference? > > I am using R 2.9.2 on Ubuntu Linux. > > Jyotirmoy Bhattacharya > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
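[Editor's note] For the archives, the `force()` fix Baptiste is pointing at looks like this. (Note also that, if I recall the NEWS correctly, since R 3.2.0 `lapply()` and friends force their arguments, so the original `genr1` now behaves as expected too.)

```r
genr3 <- function(k) {
  force(k)            # evaluate the promise now, before the closure escapes
  function() k
}
l3 <- lapply(1:2, genr3)
l3[[1]]()   # 1
l3[[2]]()   # 2
```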
Re: [R] Problems with boxplot in ggplot2:qplot
My bad: once I ran dev.off(), I did get a plot, albeit a blank one. Then I removed xlim - which I put in after qplot's complaint about xlim - and voila! Thanks a lot. -- View this message in context: http://n4.nabble.com/Problems-with-boxplot-in-ggplot2-qplot-tp1555338p1555352.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] unexpected results with higher-order functions and lapply
I want to use lapply and a function returning a function in order to build a list of functions. > genr1 <- function(k) {function() {k}} > l1 <- lapply(1:2,genr1) > l1[[1]]() [1] 2 This was unexpected. I had expected the answer to be 1, since that is the value k should be bound to when genr1 is applied to the first element of 1:2. By itself genr1 seems to work fine. > genr1(5)() [1] 5 I defined a slightly different higher-order function: > genr2 <- function(k) {k;function() {k}} > l2 <- lapply(1:2,genr2) > l2[[1]]() [1] 1 This gives the answer I expected. Now I am confused. The function returned by genr2 is exactly the same function that was being returned by genr1. Why should evaluating k make a difference? I am using R 2.9.2 on Ubuntu Linux. Jyotirmoy Bhattacharya [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problems with boxplot in ggplot2:qplot
... Unfortunately, a problem remains: I cannot label x ticks a la 'names.arg = '. month has values like '2009-01-01', '2009-02-01', etc., while I would prefer 'Jan', 'Feb'. Using closed$month = format(closed$month, "%b") disrupts the order of plot's panels, which now follows the alphabetic order of month names. -- View this message in context: http://n4.nabble.com/Problems-with-boxplot-in-ggplot2-qplot-tp1555338p1555358.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
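[Editor's note] One way to keep calendar order while using month-name labels is to build the factor's levels from the sorted dates rather than relying on alphabetical order. A small sketch (the column values are invented, and `%b` output is locale-dependent):

```r
month <- as.Date(c("2009-03-01", "2009-01-01", "2009-02-01", "2009-01-01"))
# levels follow date order, so panels keep calendar order
lab <- factor(format(month, "%b"),
              levels = format(sort(unique(month)), "%b"))
levels(lab)   # "Jan" "Feb" "Mar" in an English locale
```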
Re: [R] Newbie woes with *apply
Your function is not receiving what you may think its receiving. Try this: > x <- Sys.Date() + 1:3 > junk <- lapply(x, print) [1] 14655 [1] 14656 [1] 14657 So add this as the first statement in the body of each of your functions: date <- as.Date(date, "1970-01-01") and then using x from above try these: lapply(x, first.day.of.quarter) xx <- x; xx[] <- sapply(x, first.day.of.quarter) On Sun, Feb 14, 2010 at 8:59 AM, Dimitri Shvorob wrote: > > Dataframe cust has Date-type column open.date. I wish to set up another > column, with (first day of) the quarter of open.date. > > To be comprehensive (of course, improvement suggestions are welcome), > > month = function(date) > { > return(as.numeric(format(date,"%m"))) > } > > first.day.of.month = function(date) > { > return(date + 1 - as.numeric(format(date,"%d"))) > } > > first.day.of.quarter = function(date) > { > t = seq.Date(first.day.of.month(date), by = "-1 month", length = > month(date) %% 3) > return(t[length(t)]) > } > > Now the main part, > >> cust$open.quarter = apply(cust$open.date, 1, FUN = first.day.of.quarter) > Error in apply(cust$open.date, 1, FUN = first.day.of.quarter) : > dim(X) must have a positive length > >> cust$open.quarter = tapply(cust$open.date, FUN = first.day.of.quarter) > Error in tapply(cust$open.date, FUN = first.day.of.quarter) : > element 1 is empty; > the part of the args list of 'is.list' being evaluated was: > (INDEX) > >> cust$open.quarter = lapply(cust$open.date, FUN = first.day.of.quarter) > Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L, : > invalid 'trim' argument > > Can anyone suggest the right syntax? > > Thank you. > > -- > View this message in context: > http://n4.nabble.com/Newbie-woes-with-apply-tp1555149p1555149.html > Sent from the R help mailing list archive at Nabble.com. 
> > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problems with boxplot in ggplot2:qplot
Hi, it's hard to tell what's wrong without a reproducible example, but I noted two things: - AFAIK there is no plot method for ggplot2. You probably meant print(p) instead - if you map x to factor(month), I think it will be incompatible with your xlim values range(month). HTH, baptiste On 14 February 2010 19:55, Dimitri Shvorob wrote: > > Dataframe closed contains balances of closed accounts: each row has month of > closure (Date-type column month) and latest balance. I would like to plot > by-month distributions of balances. A qplot call below produces several > warnings and no output. > > Can anyone help? > > Thank you. > > PS. A really basic task, very similar to the examples on p. 71 of the > ggplot2 book, apart from a Date grouping column; I am quite surprised to > have problems with it. lattice package to the rescue? > > >> qplot(factor(month), balance, data = closed, geom = "boxplot", xlim = >> range(closed$month)) > There were 13 warnings (use warnings() to see them) > >> warnings() > Warning messages: > 1: Removed 1 rows containing missing values (stat_boxplot). > 2: Removed 7 rows containing missing values (geom_point). > 3: Removed 5 rows containing missing values (geom_point). > 4: Removed 8 rows containing missing values (geom_point). > 5: Removed 3 rows containing missing values (geom_point). > 6: Removed 5 rows containing missing values (geom_point). > 7: Removed 2 rows containing missing values (geom_point). > 8: Removed 12 rows containing missing values (geom_point). > 9: Removed 2 rows containing missing values (geom_point). > 10: Removed 1 rows containing missing values (geom_point). > 11: Removed 2 rows containing missing values (geom_point). > 12: Removed 3 rows containing missing values (geom_point). > 13: Removed 4 rows containing missing values (geom_point). > >> p = qplot(factor(month), balance, data = closed, geom = "boxplot", xlim = >> range(closed$month)) >> plot(p) > Error in plot.window(...) 
: need finite 'xlim' values > -- > View this message in context: > http://n4.nabble.com/Problems-with-boxplot-in-ggplot2-qplot-tp1555338p1555338.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problems with boxplot in ggplot2:qplot
Dataframe closed contains balances of closed accounts: each row has month of closure (Date-type column month) and latest balance. I would like to plot by-month distributions of balances. A qplot call below produces several warnings and no output. Can anyone help? Thank you. PS. A really basic task, very similar to the examples on p. 71 of the ggplot2 book, apart from a Date grouping column; I am quite surprised to have problems with it. lattice package to the rescue? > qplot(factor(month), balance, data = closed, geom = "boxplot", xlim = > range(closed$month)) There were 13 warnings (use warnings() to see them) > warnings() Warning messages: 1: Removed 1 rows containing missing values (stat_boxplot). 2: Removed 7 rows containing missing values (geom_point). 3: Removed 5 rows containing missing values (geom_point). 4: Removed 8 rows containing missing values (geom_point). 5: Removed 3 rows containing missing values (geom_point). 6: Removed 5 rows containing missing values (geom_point). 7: Removed 2 rows containing missing values (geom_point). 8: Removed 12 rows containing missing values (geom_point). 9: Removed 2 rows containing missing values (geom_point). 10: Removed 1 rows containing missing values (geom_point). 11: Removed 2 rows containing missing values (geom_point). 12: Removed 3 rows containing missing values (geom_point). 13: Removed 4 rows containing missing values (geom_point). > p = qplot(factor(month), balance, data = closed, geom = "boxplot", xlim = > range(closed$month)) > plot(p) Error in plot.window(...) : need finite 'xlim' values -- View this message in context: http://n4.nabble.com/Problems-with-boxplot-in-ggplot2-qplot-tp1555338p1555338.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Newbie woes with *apply
Many thanks, but my focus is actually on *apply usage. -- View this message in context: http://n4.nabble.com/Newbie-woes-with-apply-tp1555149p1555329.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Keeping the attributes of an observation
Hey R list, Can someone help me with the following question? I can create a listing of the unique observations in a dataframe by using: df <- df[order(df$start),] df$first_ob[!duplicated(df$ID)] <- "1" df$first_ob[duplicated(df$ID)] <- "0" Now I want to hold specific attributes of the first observation for the other (second or third) observations of a specific element (named ID). So for example if I have an element with ID X that is ok in the first observation and not ok in the second, I want to hold the ok in a column specified for that purpose. An example of the output would be:

ID  start     first_ob  T1      T2
X   1/1/2010  1         ok      ok
X   2/2/2010  0         not ok  ok

Thanks in advance. Kind regards, John [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
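[Editor's note] One hedged sketch of that carry-forward step, using `match()` to look up each ID's first (earliest-start) row. Column names follow the example; the T1/T2 semantics are guessed from the desired output, where T2 repeats the first observation's status:

```r
df <- data.frame(ID    = c("X", "X"),
                 start = as.Date(c("2010-01-01", "2010-02-02")),
                 T1    = c("ok", "not ok"),      # the observed status
                 stringsAsFactors = FALSE)
df <- df[order(df$start), ]
df$first_ob <- as.integer(!duplicated(df$ID))    # 1 = first row for this ID
first.rows <- df[!duplicated(df$ID), ]           # one row per ID
df$T2 <- first.rows$T1[match(df$ID, first.rows$ID)]  # carry first value down
df
```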
Re: [R] Multiple missing values
Gary King's Amelia package for R and a stand alone version does EM algorithm multiple imputation. Joe King 206-913-2912 j...@joepking.com "Never throughout history has a man who lived a life of ease left a name worth remembering." --Theodore Roosevelt -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Frank E Harrell Jr Sent: Sunday, February 14, 2010 9:39 AM To: Patrick Burns Cc: r-help@r-project.org; john.macin...@ed.ac.uk Subject: Re: [R] Multiple missing values Patrick Burns wrote: > I can think of a few solutions, none perfect. > > * You could have a master dataset that has the > missing value codes you want, and a dataset that > you use which is a copy of it with real NA's in it. > > * You could add an attribute that gives the types > of missing values in the various positions. The > downside is that attributes tend to disappear with > subsetting. The sas.get function in the Hmisc exemplifies that approach, and it has a subsetting method that preserves the special.miss attribute. Frank > > * If you only have two types, you might be able to > get away with using NaN as the second type of NA. > > On 14/02/2010 14:33, John wrote: >> Does anyone know, or know documentation that describes, how to declare >> multiple values in R as missing that does not involve coding them as >> NA? I >> wish to be able to treate values as missing, while still retaining codes >> that describe the reason for the value being missing. >> >> Thanks >> >> John MAcInnes >> >> >> -- Frank E Harrell Jr Professor and ChairmanSchool of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
[R] mlogit function cut off formular
I'm trying to fit a multinomial logistic model using the mlogit package. I have 15 independent variables. The code looks like this:

m <- mlogit(score ~ 0 | f1+f2+f3+f4+f5+f6+f7+f8+f9+f10+f11+f12+f13+f14+f15, data, reflevel = "1")

And it gives the following error message:

Error in parse(text = x) : unexpected ')' in "score ~ 0 + alt:(f1 + f2 + f3 + f4 + f5 + f6 + f7 + f8 + f9 + f10 + f11 + f12 + )"

Obviously, the mlogit function somehow cut off my formula, probably because it is too long. Is this considered a bug, or am I doing something wrong? Thanks
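A hedged sketch of a workaround: truncation like this typically happens when code converts a long formula to text with deparse(), which splits the deparsed call into several strings, and then re-parses only the first one. Whether or not that is the cause here, building the formula programmatically (variable names f1..f15 as in the post) is a compact way to write the same model:

```r
# Assemble the right-hand side as a string, then turn it into a formula
rhs <- paste(paste0("f", 1:15), collapse = " + ")
f <- as.formula(paste("score ~ 0 |", rhs))
f

# Then call mlogit as in the original post (data setup assumed):
# m <- mlogit(f, data, reflevel = "1")
```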
Re: [R] lattice/ylim: how to fix ylim[1], but have ylim[2] dynamically calculated?
On Sun, Feb 14, 2010 at 7:33 AM, Johannes Graumann wrote: > Hello, > > When drawing "barcharts", I find it not helpful if ylim[1] != 0 - bars for a > quantity of 0, that do not show a length of 0 are quite non-intuitive. > > I have tried to study > > library(lattice) > > panel.barchart > but am unable to figure out where ylim is taken care of and how one might > fix ylim[1] to 0 for barcharts ... > > Can anyone point out how to tackle this? Are you sure you are not looking for 'origin=0' (described in ?panel.barchart)? -Deepayan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
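A minimal illustration of the origin = 0 suggestion, with made-up data: the bars are anchored at zero (a value of 0 draws a zero-length bar) while the upper limit is still computed dynamically from the data.

```r
library(lattice)

d <- data.frame(grp = c("a", "b", "c", "d"),
                val = c(0, 3, 7, 12))

# origin = 0 anchors the bars at zero; ylim[2] is still
# calculated from the data
barchart(val ~ grp, data = d, origin = 0)
```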
Re: [R] evaluate variable within expression - how?
Hi Mark, Try also: plot(1:10) text(2, 5, "Some text", font = 2) HTH, Jorge On Sun, Feb 14, 2010 at 11:36 AM, Mark Heckmann <> wrote: > # I want to plot bold text. The text should depend on a variable > containing a character string. > > plot.new() > text(.5, .5, expression(bold("Some text"))) > > # now I would like to do the same replacing "some text" by a variable. > > plot.new() > myText <- "some text" > text(.5, .5, expression(bold(myText))) > > # This obviouyls does not work. > # How can I combine substitute, parse, paste etc. do get what I want? > # And could someone add a brief explanation what happens, as I find > parse, expression, deparse, substitute etc. not easy to understand. > > Thanks, > Mark > > > Mark Heckmann > Dipl. Wirt.-Ing. cand. Psych. > Vorstraße 93 B01 > 28359 Bremen > Blog: www.markheckmann.de > R-Blog: http://ryouready.wordpress.com > > > > > >[[alternative HTML version deleted]] > > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lattice/ylim: how to fix ylim[1], but have ylim[2] dynamically calculated?
On Feb 14, 2010, at 10:33 AM, Johannes Graumann wrote: Hello, When drawing "barcharts", I find it not helpful if ylim[1] != 0 - bars for a quantity of 0, that do not show a length of 0 are quite non-intuitive. I have tried to study > library(lattice) > panel.barchart but am unable to figure out where ylim is taken care of and how one might fix ylim[1] to 0 for barcharts ... Can anyone point out how to tackle this? Looking at Sarkar's "Lattice" text in chapter 8 section 3 "Limits and Aspect Ratio", it appears from subsection 1 that the prepanel function can used to supply values of xlim and ylim values. From subsection 2 he clarifies that xlim and ylim can also be specified on a per panel basis (and here I am guessing that this would be within a scales argument) when relation="free". At the end of that section he offers two examples using ylim: the first is not plotted but the second uses the prepanel mechanism for Fig 8.1 and that is probably available on the Lattice website. In the same subsection is offered an alternative to specifying an explicit scales$y$limits to be interpreted as ylim values. My hope it that these ideas and references will be of some use in identifying productive places to look for further documentation. -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Multiple missing values
Patrick Burns wrote: I can think of a few solutions, none perfect. * You could have a master dataset that has the missing value codes you want, and a dataset that you use which is a copy of it with real NA's in it. * You could add an attribute that gives the types of missing values in the various positions. The downside is that attributes tend to disappear with subsetting. The sas.get function in the Hmisc exemplifies that approach, and it has a subsetting method that preserves the special.miss attribute. Frank * If you only have two types, you might be able to get away with using NaN as the second type of NA. On 14/02/2010 14:33, John wrote: Does anyone know, or know documentation that describes, how to declare multiple values in R as missing that does not involve coding them as NA? I wish to be able to treate values as missing, while still retaining codes that describe the reason for the value being missing. Thanks John MAcInnes -- Frank E Harrell Jr Professor and ChairmanSchool of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] evaluate variable within expression - how?
Try

text(.5, .5, bquote(bold(.(myText))))

-Peter Ehlers

Mark Heckmann wrote:

# I want to plot bold text. The text should depend on a variable containing a character string.
plot.new()
text(.5, .5, expression(bold("Some text")))
# Now I would like to do the same, replacing "Some text" by a variable.
plot.new()
myText <- "some text"
text(.5, .5, expression(bold(myText)))
# This obviously does not work.
# How can I combine substitute, parse, paste etc. to get what I want?
# And could someone add a brief explanation of what happens, as I find parse, expression, deparse, substitute etc. not easy to understand.

Thanks,
Mark

Mark Heckmann
Dipl. Wirt.-Ing. cand. Psych.
Vorstraße 93 B01
28359 Bremen
Blog: www.markheckmann.de
R-Blog: http://ryouready.wordpress.com

--
Peter Ehlers
University of Calgary
Re: [R] evaluate variable within expression - how?
Hi,

Try with bquote:

plot.new()
myText <- "some text"
text(.5, .5, bquote(bold(.(myText))))

Basically, bquote(.(myText)) performs the substitution before applying bold() (see ?bquote).

HTH,
baptiste

On 14 February 2010 17:36, Mark Heckmann wrote:
> # I want to plot bold text. The text should depend on a variable containing a character string.
> plot.new()
> text(.5, .5, expression(bold("Some text")))
> # Now I would like to do the same, replacing "Some text" by a variable.
> plot.new()
> myText <- "some text"
> text(.5, .5, expression(bold(myText)))
> # This obviously does not work.
> # How can I combine substitute, parse, paste etc. to get what I want?
> # And could someone add a brief explanation of what happens, as I find parse, expression, deparse, substitute etc. not easy to understand.
>
> Thanks,
> Mark
>
> Mark Heckmann
> Dipl. Wirt.-Ing. cand. Psych.
> Vorstraße 93 B01
> 28359 Bremen
> Blog: www.markheckmann.de
> R-Blog: http://ryouready.wordpress.com
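For completeness, the parse/paste route the original poster asked about also works, although bquote() is usually cleaner. This builds the plotmath expression as a string and then parses it:

```r
plot.new()
myText <- "some text"

# Construct the string 'bold("some text")', then parse it
# into an expression that text() renders as plotmath
text(.5, .5, parse(text = sprintf('bold("%s")', myText)))
```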
Re: [R] Shapiro-Wilk for levels of factor
On Sun, 14 Feb 2010, Ravi Kulkarni wrote:

> Hello, I have data for an ANOVA where the between-subjects factor has three levels. How do I run a test of normality (using shapiro.test) on each of the levels of the factor for the dependent variable separately, without creating extra datasets?

You can use

tapply(y, x, shapiro.test)

which will then conduct as many Shapiro-Wilk tests as x has levels (without adjusting for multiple testing). Another approach might be to look at

shapiro.test(residuals(lm(y ~ x)))

which tests the null hypothesis that the residuals in all groups come from the same normal distribution. A worked example for the chickwts data is included below.

Z

## data
summary(chickwts)

## linear model and ANOVA
fm <- lm(weight ~ feed, data = chickwts)
anova(fm)

## QQ plot for residuals + Shapiro-Wilk test
shapiro.test(residuals(fm))

## separate tests for all groups of observations
## (with some formatting)
do.call("rbind", with(chickwts, tapply(weight, feed, function(x)
  unlist(shapiro.test(x)[c("statistic", "p.value")]))))

> Thanks, Ravi
[R] Shapiro-Wilk for levels of factor
Hello, I have data for an ANOVA where the between-subjects factor has three levels. How do I run a test of normality (using shapiro.test) on each of the levels of the factor for the dependent variable separately without creating extra datasets? Thanks, Ravi __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] evaluate variable within expression - how?
# I want to plot bold text. The text should depend on a variable containing a character string.

plot.new()
text(.5, .5, expression(bold("Some text")))

# Now I would like to do the same, replacing "Some text" by a variable.

plot.new()
myText <- "some text"
text(.5, .5, expression(bold(myText)))

# This obviously does not work.
# How can I combine substitute, parse, paste etc. to get what I want?
# And could someone add a brief explanation of what happens, as I find parse, expression, deparse, substitute etc. not easy to understand.

Thanks,
Mark

Mark Heckmann
Dipl. Wirt.-Ing. cand. Psych.
Vorstraße 93 B01
28359 Bremen
Blog: www.markheckmann.de
R-Blog: http://ryouready.wordpress.com
Re: [R] Multiple missing values
I can think of a few solutions, none perfect. * You could have a master dataset that has the missing value codes you want, and a dataset that you use which is a copy of it with real NA's in it. * You could add an attribute that gives the types of missing values in the various positions. The downside is that attributes tend to disappear with subsetting. * If you only have two types, you might be able to get away with using NaN as the second type of NA. On 14/02/2010 14:33, John wrote: Does anyone know, or know documentation that describes, how to declare multiple values in R as missing that does not involve coding them as NA? I wish to be able to treate values as missing, while still retaining codes that describe the reason for the value being missing. Thanks John MAcInnes __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Patrick Burns pbu...@pburns.seanet.com http://www.burns-stat.com (home of 'The R Inferno' and 'A Guide for the Unwilling S User') __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lattice/ylim: how to fix ylim[1], but have ylim[2] dynamically calculated?
Hello, When drawing "barcharts", I find it not helpful if ylim[1] != 0 - bars for a quantity of 0, that do not show a length of 0 are quite non-intuitive. I have tried to study > library(lattice) > panel.barchart but am unable to figure out where ylim is taken care of and how one might fix ylim[1] to 0 for barcharts ... Can anyone point out how to tackle this? Thanks, Joh __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Get rid of the first row of the LaTeX table generated by xtable
Dear Livlu and Uwe, This is exactly what I need, thanks. Shige On Sun, Feb 14, 2010 at 10:11 AM, Uwe Ligges wrote: > See ?print.xtable and its argument "include.rownames". > > Uwe Ligges > > On 14.02.2010 16:06, Shige Song wrote: >> >> Dear All, >> >> I am trying to generate a LaTeX table from a small data frame using >> xtable. I have three variable and 10 records. However, the resulted >> LaTeX table has four columns (instead of three), of which the first >> column seems to be an automatically generated ID. Is there a way to >> get rid of this column? >> >> Thanks. >> >> Shige >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Multiple missing values
NA, Inf, -Inf, NaN would give you 4 possibilities, and is.finite would check whether a value is any of them:

> x <- c(1, NA, 2, Inf, 3, -Inf, 4, NaN, 5)
> is.finite(x)
[1] TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE

You might need to map them all to NA before using them with various functions, depending on how the functions deal with these values. Another possibility is to have an attribute with a factor defining the type of each NA:

x <- c(1, NA, 2, NA, 3, NA)
attr(x, "type.of.na") <- factor(c("A", "B", "A"))

and, depending on how much work you are prepared to do, you could define a new R class that handles objects with such an attribute.

On Sun, Feb 14, 2010 at 9:33 AM, John wrote:
> Does anyone know, or know documentation that describes, how to declare multiple values in R as missing that does not involve coding them as NA? I wish to be able to treat values as missing, while still retaining codes that describe the reason for the value being missing.
>
> Thanks
>
> John MacInnes
>
> --
> Professor John MacInnes
> Sociology, School of Social and Political Studies,
> No 8 Buccleuch Place
> University of Edinburgh
> Edinburgh EH8 9LN
> +44 (0)131 651 3867
>
> Centre d'Estudis Demogràfics
> Universitat Autònoma de Barcelona
> Edifici E-2
> 08193 Bellaterra (Barcelona)
> Spain
> +34 93 581 3060
> "The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336."
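A sketch of the "new R class" idea mentioned above (the class name na_typed and the attribute layout are made up here): a subset method that carries a per-element missing-type attribute along, which a plain attribute would otherwise lose on subsetting.

```r
# Subsetting method for the hypothetical 'na_typed' class
`[.na_typed` <- function(x, i, ...) {
  y <- NextMethod("[")                   # default subsetting drops attributes
  attr(y, "type.of.na") <- attr(x, "type.of.na")[i]
  class(y) <- class(x)
  y
}

x <- c(1, NA, 2, NA, 3)
attr(x, "type.of.na") <- c(NA, "refused", NA, "not asked", NA)
class(x) <- c("na_typed", class(x))

x[2:4]   # the subset keeps the matching missing-type codes
```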
Re: [R] Get rid of the first row of the LaTeX table generated by xtable
See ?print.xtable and its argument "include.rownames". Uwe Ligges On 14.02.2010 16:06, Shige Song wrote: Dear All, I am trying to generate a LaTeX table from a small data frame using xtable. I have three variable and 10 records. However, the resulted LaTeX table has four columns (instead of three), of which the first column seems to be an automatically generated ID. Is there a way to get rid of this column? Thanks. Shige __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Get rid of the first row of the LaTeX table generated by xtable
On 2/14/10, Shige Song wrote: > column seems to be an automatically generated ID. Is there a way to > get rid of this column? > Perhaps ?print.xtable include.rownames=F Liviu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
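Putting the two answers together, a minimal worked example (with a stand-in data frame, assuming the xtable package is installed):

```r
library(xtable)

df <- data.frame(a = 1:3,
                 b = c("x", "y", "z"),
                 c = c(1.5, 2.5, 3.5))

# Suppress the automatic row-name column in the LaTeX output
print(xtable(df), include.rownames = FALSE)
```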
Re: [R] SAS and RODBC
On 14.02.2010 08:19, Daniel Nordlund wrote: -Original Message- From: Frank E Harrell Jr [mailto:f.harr...@vanderbilt.edu] Sent: Saturday, February 13, 2010 5:49 AM To: Daniel Nordlund Cc: r-help@r-project.org Subject: Re: [R] SAS and RODBC Daniel Nordlund wrote: . . . This is just a quick follow-up to my previous post. Based on Prof. Ripley's response I went back and looked at the SAS log file and reread the RODBC help pages. The problem of writing a SAS dataset was solved by setting colQuote=NULL in the call to the odbcConnect() function. ch<- odbcConnect('sasodbc', believeNRows=FALSE, colQuote=NULL) I hope this will be useful to others who may have the SAS BASE product and want to do graphics or statistical analyses with their SAS data, but can't afford the high licensing fees for the SAS STAT and GRAPH modules. Thanks to Prof. Ripley for the fine RODBC package. Dan Daniel Nordlund Bothell, WA USA Daniel since you have SAS BASE installed why not use sas.get in the Hmisc package and also get access to metadata such as variable labels that ODBC does not handle? Besides providing better documentation, the labels are very useful as axis labels in plotting, etc. Frank Frank, I have used sas.get from Hmisc before, and I will continue to use it. I appreciate the work that you and your colleagues have done with Hmisc and the Design and rms packages. However, the sas.get function still appears to be broken on Windows platforms (or Windows is broken :-). I know how to fix the problem, but I am always looking for approaches where I don't have to fix things. It may well be that the better documentation provided by sas.get will prove to out weigh the inconvenience of having to source an edited version of sas.get for my regular use. Daniel, if the function is broken and needs to be fixed, why don't you report your findings (both the error with reproducible code) as well as your bugfixes to the package maintainer? 
Best wishes, Uwe As for moving data from R to SAS, I don't know of any methods other than the RODBC package with the SAS ODBC driver for writing SAS datasets. Yes, I can write to csv or other file types that SAS can import, but if I can eliminate extra steps when going from R to SAS then that is a plus for me. Thanks again for the great tools, Dan Daniel Nordlund Bothell, WA USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Get rid of the first row of the LaTeX table generated by xtable
Dear All, I am trying to generate a LaTeX table from a small data frame using xtable. I have three variable and 10 records. However, the resulted LaTeX table has four columns (instead of three), of which the first column seems to be an automatically generated ID. Is there a way to get rid of this column? Thanks. Shige __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Newbie woes with *apply
Try library(zoo) x <- as.Date(seq(1, 500, 50)) # test data as.Date(as.yearmon(x)) as.Date(as.yearqtr(x)) On Sun, Feb 14, 2010 at 8:59 AM, Dimitri Shvorob wrote: > > Dataframe cust has Date-type column open.date. I wish to set up another > column, with (first day of) the quarter of open.date. > > To be comprehensive (of course, improvement suggestions are welcome), > > month = function(date) > { > return(as.numeric(format(date,"%m"))) > } > > first.day.of.month = function(date) > { > return(date + 1 - as.numeric(format(date,"%d"))) > } > > first.day.of.quarter = function(date) > { > t = seq.Date(first.day.of.month(date), by = "-1 month", length = > month(date) %% 3) > return(t[length(t)]) > } > > Now the main part, > >> cust$open.quarter = apply(cust$open.date, 1, FUN = first.day.of.quarter) > Error in apply(cust$open.date, 1, FUN = first.day.of.quarter) : > dim(X) must have a positive length > >> cust$open.quarter = tapply(cust$open.date, FUN = first.day.of.quarter) > Error in tapply(cust$open.date, FUN = first.day.of.quarter) : > element 1 is empty; > the part of the args list of 'is.list' being evaluated was: > (INDEX) > >> cust$open.quarter = lapply(cust$open.date, FUN = first.day.of.quarter) > Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L, : > invalid 'trim' argument > > Can anyone suggest the right syntax? > > Thank you. > > -- > View this message in context: > http://n4.nabble.com/Newbie-woes-with-apply-tp1555149p1555149.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
[R] Multiple missing values
Does anyone know, or know documentation that describes, how to declare multiple values in R as missing that does not involve coding them as NA? I wish to be able to treat values as missing, while still retaining codes that describe the reason for the value being missing.

Thanks

John MacInnes

--
Professor John MacInnes
Sociology, School of Social and Political Studies,
No 8 Buccleuch Place
University of Edinburgh
Edinburgh EH8 9LN
+44 (0)131 651 3867

Centre d'Estudis Demogràfics
Universitat Autònoma de Barcelona
Edifici E-2
08193 Bellaterra (Barcelona)
Spain
+34 93 581 3060

"The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336."
Re: [R] Newbie woes with *apply
Bug fix: first.day.of.quarter = function(date) { t = first.day.of.month(date) l = month(date) %% 3 if (l == 0) return(t) t = seq.Date(t, by = "-1 month", length = l) return(t[length(t)]) } But the *apply part still does not work. -- View this message in context: http://n4.nabble.com/Newbie-woes-with-apply-tp1555149p1555167.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Newbie woes with *apply
Dataframe cust has Date-type column open.date. I wish to set up another column, with (first day of) the quarter of open.date. To be comprehensive (of course, improvement suggestions are welcome), month = function(date) { return(as.numeric(format(date,"%m"))) } first.day.of.month = function(date) { return(date + 1 - as.numeric(format(date,"%d"))) } first.day.of.quarter = function(date) { t = seq.Date(first.day.of.month(date), by = "-1 month", length = month(date) %% 3) return(t[length(t)]) } Now the main part, > cust$open.quarter = apply(cust$open.date, 1, FUN = first.day.of.quarter) Error in apply(cust$open.date, 1, FUN = first.day.of.quarter) : dim(X) must have a positive length > cust$open.quarter = tapply(cust$open.date, FUN = first.day.of.quarter) Error in tapply(cust$open.date, FUN = first.day.of.quarter) : element 1 is empty; the part of the args list of 'is.list' being evaluated was: (INDEX) > cust$open.quarter = lapply(cust$open.date, FUN = first.day.of.quarter) Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L, : invalid 'trim' argument Can anyone suggest the right syntax? Thank you. -- View this message in context: http://n4.nabble.com/Newbie-woes-with-apply-tp1555149p1555149.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
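For reference, apply() needs an array or matrix (hence the "dim(X) must have a positive length" error), and tapply() needs an INDEX argument. A vectorized base-R alternative needs no *apply at all, since cut.Date() already knows about quarters:

```r
# Hypothetical open.date values
open.date <- as.Date(c("2010-02-14", "2010-05-01", "2010-12-31"))

# First day of the quarter, computed in one vectorized step
open.quarter <- as.Date(cut(open.date, breaks = "quarter"))
open.quarter
# [1] "2010-01-01" "2010-04-01" "2010-10-01"
```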
[R] Help for programming a short code in R
Dear all, I'm looking for a person who could help me to program a short code in R. The code involves Bayesian analysis so some familiarity with WinBUGS or another package/ software dealing with Bayesian estimation would be helpful. I have an academic paper in which the code is described ("Abe, M. (2009), ""Counting your customers" one by one: A hierarchical Bayes extension to the Pareto/NBD model," Marketing science, Vol. 28 No. 3, pp. 541 - 53") as well as one of the datasets mentioned in this manuscript to test the code. My assumption is that the job does not take very long -- although I cannot give a precise estimate of the number of hours required. If anyone is interested, please let me know and I can send you an electronic copy of the manuscript mentioned above. Best, Michael Michael Haenlein Professor of Marketing ESCP Europe - The School of Management for Europe [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lm function in R
On Sat, Feb 13, 2010 at 7:09 PM, Daniel Malter wrote: > It seems to me that your question is more about the econometrics than about > R. Any introductory econometric textbook or compendium on econometrics will > cover this as it is a basic. See, for example, Greene 2006 or Wooldridge > 2002. > > Say X is your data matrix, that contains columns for each of the individual > variables (x), columns with their interactions, and one column of 1s for the > intercept. Let y be your dependent variable. Then, OLS estimates are > computed by > > X'X inverse X'y > > Or in R > > solve(t(X)%*%X)%*%t(X)%*%y But that is not the way that the coefficients are calculated in R. It is the formula that is given in text books but, like most text book formulas, is not suitable for computation, especially with large data sets and complex models. There are many practical ways to calculate the least squares coefficients in large data sets and they all involve decompositions. For a Hadoop environment a scatter-gather approach would be to form Cholesky decompositions of the cross-product of sections of rows of the model matrix then combine the decompositions. > Best, > Daniel > > > - > cuncta stricte discussurus > - > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf Of Something Something > Sent: Saturday, February 13, 2010 5:04 PM > To: Bert Gunter > Cc: r-help@r-project.org > Subject: Re: [R] lm function in R > > I tried.. > > mod = lm(Y ~ X1*X2*X3, na.action = na.exclude) > formula(mod) > > This produced > Y ~ X1 * X2 * X3 > > > When I typed just mod I got: > > Call: > lm(formula = Y ~ X1 * X2 * X3, na.action = na.exclude) > > Coefficients: > (Intercept) X11 X21 X31 X11:X21 X11:X31 > X21:X31 X11:X21:X31 > 177.9245 0.2005 2.4482 3.1216 0.8127 -26.6166 > -3.0398 29.6049 > > > I am trying to figure out how R computed all these coefficients. 
> > > > > On Sat, Feb 13, 2010 at 1:30 PM, Bert Gunter wrote: > >> ?formula >> >> >> Bert Gunter >> Genentech Nonclinical Statistics >> >> -Original Message- >> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] >> On >> Behalf Of Something Something >> Sent: Saturday, February 13, 2010 1:24 PM >> To: Daniel Nordlund >> Cc: r-help@r-project.org >> Subject: Re: [R] lm function in R >> >> Thanks Dan. Yes that was very helpful. I didn't see the change from '*' >> to >> '+'. >> >> Seems like when I put * it means - interaction & when I put + it's not an >> interaction. >> >> Is it correct to assume then that... >> >> When I put + R evaluates the following equation: >> Y-hat = b0 + b1X1 + b2X2 + ... + bkXk >> >> >> But when I put * R evaluates the following equation: >> Y-hat = b0 + b1X1 + b2X2 + ... + bkXk + b12X1X2 + b13X1X3 + ... >> >> Is this correct? If it is then can someone point me to any sources that >> will explain how the coefficients (such as b0... bk, b12.. , b123..) are >> calculated. I guess, one source is the R source code :) but is there any >> other documentation anywhere? >> >> Please let me know. Thanks. >> >> >> >> On Fri, Feb 12, 2010 at 5:54 PM, Daniel Nordlund >> wrote: >> >> > > -Original Message- >> > > From: r-help-boun...@r-project.org [mailto: >> r-help-boun...@r-project.org] >> > > On Behalf Of Something Something >> > > Sent: Friday, February 12, 2010 5:28 PM >> > > To: Phil Spector; r-help@r-project.org >> > > Subject: Re: [R] lm function in R >> > > >> > > Thanks for the replies everyone. Greatly appreciate it. Some >> progress, >> > > but >> > > now I am getting the following values when I don't use "as.factor" >> > > >> > > 13.14167 25.11667 28.34167 49.14167 40.39167 66.86667 >> > > >> > > Is that what you guys get? >> > >> > >> > If you look at Phil's response below, no, that is not what he got. 
The >> > difference is that you are specifying an interaction, whereas Phil did >> not >> > (because the equation you initially specified did not include an >> > interaction). Use Y ~ X1 + X2 instead of Y ~ X1*X2 for your formula. >> > >> > > >> > > >> > > On Fri, Feb 12, 2010 at 5:00 PM, Phil Spector >> > > wrote: >> > > >> > > > By converting the two variables to factors, you are fitting >> > > > an entirely different model. Leave out the as.factor stuff >> > > > and it will work exactly as you want it to. >> > > > >> > > > dat >> > > >> >> > > > V1 V2 V3 V4 >> > > > 1 s1 14 4 1 >> > > > 2 s2 23 4 2 >> > > > 3 s3 30 7 2 >> > > > 4 s4 50 7 4 >> > > > 5 s5 39 10 3 >> > > > 6 s6 67 10 6 >> > > > >> > > >> names(dat) = c('id','y','x1','x2') >> > > >> z = lm(y~x1+x2,dat) >> > > >> predict(z) >> > > >> >> > > > 1 2 3 4 5 6 15.16667 >> 24.7 >> > > > 27.7 46.7 40.16667 68.7 >> > > > >> > > > - Ph
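To make the two points in this thread concrete, here is a small sketch using the toy data from Phil's example above. It shows how `+` and `*` expand in a formula, and two hand-rolled ways of computing the coefficients: the textbook normal-equations formula Daniel quoted, and decomposition-based routes (QR, which is close to what `lm()` actually uses, and Cholesky, as mentioned for the scatter-gather setting). The data frame `d` simply reuses the values from the thread.

```r
# Toy data taken from Phil's example earlier in this thread
d <- data.frame(y  = c(14, 23, 30, 50, 39, 67),
                x1 = c(4, 4, 7, 7, 10, 10),
                x2 = c(1, 2, 2, 4, 3, 6))

# '+' gives main effects only; '*' also adds the x1:x2 interaction column
colnames(model.matrix(y ~ x1 + x2, d))  # (Intercept), x1, x2
colnames(model.matrix(y ~ x1 * x2, d))  # (Intercept), x1, x2, x1:x2

# Three ways to get the least-squares coefficients for the interaction model
X <- model.matrix(y ~ x1 * x2, d)
b_normal <- solve(t(X) %*% X) %*% t(X) %*% d$y            # textbook (X'X)^{-1} X'y
b_qr     <- qr.coef(qr(X), d$y)                           # QR decomposition
b_chol   <- chol2inv(chol(crossprod(X))) %*% crossprod(X, d$y)  # Cholesky route

# All three agree to numerical tolerance
all.equal(as.vector(b_normal), as.vector(b_qr))    # TRUE
all.equal(as.vector(b_normal), as.vector(b_chol))  # TRUE
```

The decompositions avoid explicitly inverting X'X, which is why they behave better on large or ill-conditioned model matrices.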
Re: [R] Labels on a pyramide
Pardon my ignorance here. Jim's code works, but the problem is that the text comes over the plot. Even after adjusting laxlab and raxlab to get more plot area, the text keeps being cut. And one more thing that might be added as new functionality: an option to show the value of each bar. Thanks for your help Caveman On Sun, Feb 14, 2010 at 1:05 PM, Jim Lemon wrote: > On 02/14/2010 01:27 AM, Orvalho Augusto wrote: >> >> I am using pyramid.plot() from the plotrix package. >> ... >> The problem is (1) I do not want plot agelabels on the center and (2) >> I want plot different labels for each pair of the bars (one label for >> masculine and the other feminine). >> >> The data represent the 10 most frequent cancer in a group of individuals. >> > > Bueno troglodita, > You have discovered a deficiency in pyramid.plot, and for that you get an > answer to your question and brand new source code! > > source("pyramid.plot.R") > > after loading the plotrix package. Use it quickly, for it will be in the > next version of plotrix and then everyone (todo el mundo, hombre!) will be > using it. > > Jim > > # it must be wider than the default to accommodate the long labels > x11(width=10) > par(mar=pyramid.plot(dados$masfr,dados$femfr, > laxlab=c(0,10,20,30,40),raxlab=c(0,10,20,30,40), > main="Primeiras 10 cancros mais frequentes por sexo", > top.labels=c("Masculino", "Tipo de cancro", "Feminino"), > labels=dados[,c("maslab","femlab")], > xycol=rep("#ff",10),xxcol=rep("#ff88ff",10), gap=25)) > -- OpenSource Software Consultant CENFOSS (www.cenfoss.co.mz) SP Tech (www.sptech.co.mz) email: orvaq...@cenfoss.co.mz cell: +258828810980 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] multiple-test correlation
Dear all, I am trying to study the correlation between one "independent" variable ("V1") and several other variables ("V2", "V3", "V4" and "V5") that are themselves mutually dependent. To do so, I would like to analyze my data with multiple tests (applying the Bonferroni correction or something similar), but I cannot find the proper command in R. What I want to do is to calculate Kendall's correlation between "V1" and each of the other variables (i.e. "V1" vs "V2", "V1" vs "V3", etc.) and to correct the p values by Bonferroni or another method. I have found "outlier.test", but I do not know if this is what I need (also, I would prefer to use a less conservative method than Bonferroni's, if possible). Thank you very much in advance! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
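One way to do what the poster describes, sketched with made-up data (the variable names V1..V5 follow the question; the random numbers are only placeholders): run `cor.test(..., method = "kendall")` for each pair and then adjust the resulting p-values with `p.adjust()`. The "holm" method is less conservative than plain Bonferroni while still controlling the family-wise error rate.

```r
# Made-up data in the shape described in the question (30 rows, V1..V5)
set.seed(42)
d <- data.frame(V1 = rnorm(30), V2 = rnorm(30), V3 = rnorm(30),
                V4 = rnorm(30), V5 = rnorm(30))

# Kendall correlation tests of V1 against each of the other variables
p_raw <- sapply(c("V2", "V3", "V4", "V5"), function(v)
  cor.test(d$V1, d[[v]], method = "kendall")$p.value)

# Multiplicity correction: "holm" is less conservative than "bonferroni"
p.adjust(p_raw, method = "holm")
```

See `?p.adjust` for the other available methods (e.g. "BH" for false discovery rate control, which is less conservative still).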
Re: [R] Labels on a pyramide
Jim, thanks for your great work! I will try to use your source code. Caveman PS: I admit my troglodyte ways, which is why I asked for help. Sorry to trouble everyone with this. On Sun, Feb 14, 2010 at 1:05 PM, Jim Lemon wrote: > On 02/14/2010 01:27 AM, Orvalho Augusto wrote: >> >> I am using pyramid.plot() from the plotrix package. >> ... >> The problem is (1) I do not want plot agelabels on the center and (2) >> I want plot different labels for each pair of the bars (one label for >> masculine and the other feminine). >> >> The data represent the 10 most frequent cancer in a group of individuals. >> > > Bueno troglodita, > You have discovered a deficiency in pyramid.plot, and for that you get an > answer to your question and brand new source code! > > source("pyramid.plot.R") > > after loading the plotrix package. Use it quickly, for it will be in the > next version of plotrix and then everyone (todo el mundo, hombre!) will be > using it. > > Jim > > # it must be wider than the default to accommodate the long labels > x11(width=10) > par(mar=pyramid.plot(dados$masfr,dados$femfr, > laxlab=c(0,10,20,30,40),raxlab=c(0,10,20,30,40), > main="Primeiras 10 cancros mais frequentes por sexo", > top.labels=c("Masculino", "Tipo de cancro", "Feminino"), > labels=dados[,c("maslab","femlab")], > xycol=rep("#ff",10),xxcol=rep("#ff88ff",10), gap=25)) > -- OpenSource Software Consultant CENFOSS (www.cenfoss.co.mz) SP Tech (www.sptech.co.mz) email: orvaq...@cenfoss.co.mz cell: +258828810980 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Labels on a pyramide
On 02/14/2010 01:27 AM, Orvalho Augusto wrote: I am using pyramid.plot() from the plotrix package. ... The problem is (1) I do not want plot agelabels on the center and (2) I want plot different labels for each pair of the bars (one label for masculine and the other feminine). The data represent the 10 most frequent cancer in a group of individuals. Bueno troglodita, You have discovered a deficiency in pyramid.plot, and for that you get an answer to your question and brand new source code! source("pyramid.plot.R") after loading the plotrix package. Use it quickly, for it will be in the next version of plotrix and then everyone (todo el mundo, hombre!) will be using it. Jim # it must be wider than the default to accommodate the long labels x11(width=10) par(mar=pyramid.plot(dados$masfr,dados$femfr, laxlab=c(0,10,20,30,40),raxlab=c(0,10,20,30,40), main="Primeiras 10 cancros mais frequentes por sexo", top.labels=c("Masculino", "Tipo de cancro", "Feminino"), labels=dados[,c("maslab","femlab")], xycol=rep("#ff",10),xxcol=rep("#ff88ff",10), gap=25)) pyramid.plot<-function(xy,xx,labels=NA,top.labels=c("Male","Age","Female"), main="",laxlab=NULL,raxlab=NULL,unit="%",xycol,xxcol,gap=1, labelcex=1,mark.cat=NA,add=FALSE) { if(any(c(xy,xx)<0)) stop("Negative quantities not allowed") xydim<-dim(xy) if(length(labels)==1) labels<-1:xydim[1] ncats<-ifelse(is.null(dim(labels)),length(labels),length(labels[,1])) if(is.null(xydim)) { if(length(xy) != ncats || length(xx) != ncats) stop("xy, xx and labels must all be the same length") halfwidth<-ceiling(max(c(xy,xx)))+gap } else { if(length(xy[,1]) != ncats || length(xx[,1]) != ncats) stop("xy, xx and labels must all be the same length") halfwidth<-ceiling(max(c(rowSums(xy),rowSums(xx))))+gap } oldmar<-par("mar") if(!add) { par(mar=c(4,2,4,2)) plot(0,xlim=c(-halfwidth,halfwidth),ylim=c(0,ncats+1), type="n",axes=FALSE,xlab="",ylab="",xaxs="i",yaxs="i",main=main) if(is.null(laxlab)) { laxlab<-seq(halfwidth-gap,0,by=-1) 
axis(1,at=-halfwidth:-gap,labels=laxlab) } else axis(1,at=-(laxlab+gap),labels=laxlab) if(is.null(raxlab)) { raxlab<-0:(halfwidth-gap) axis(1,at=gap:halfwidth,labels=raxlab) } else axis(1,at=raxlab+gap,labels=raxlab) axis(2,at=1:ncats,labels=rep("",ncats),pos=gap,tcl=-0.25) axis(4,at=1:ncats,labels=rep("",ncats),pos=-gap,tcl=-0.25) if(!is.na(mark.cat)) boxed.labels(0,mark.cat,labels[mark.cat]) if(is.null(dim(labels))) text(0,1:ncats,labels,cex=labelcex) else { text(-gap*0.9,1:ncats,labels[,1],cex=labelcex,adj=0) text(gap*0.9,1:ncats,labels[,2],cex=labelcex,adj=1) } mtext(top.labels,3,0,at=c(-halfwidth/2,0,halfwidth/2), adj=0.5,cex=labelcex) mtext(c(unit,unit),1,2,at=c(-halfwidth/2,halfwidth/2)) } if(is.null(xydim)) { if(missing(xycol)) xycol<-rainbow(ncats) if(missing(xxcol)) xxcol<-rainbow(ncats) rect(-(xy+gap),1:ncats-0.4,rep(-gap,ncats),1:ncats+0.4,col=xycol) rect(rep(gap,ncats),1:ncats-0.4,(xx+gap),1:ncats+0.4,col=xxcol) } else { if(missing(xycol)) xycol<-rainbow(xydim[2]) if(missing(xxcol)) xxcol<-rainbow(xydim[2]) xystart<-xxstart<-rep(gap,ncats) for(i in 1:xydim[2]) { xycolor<-rep(xycol[i],ncats) xxcolor<-rep(xxcol[i],ncats) rect(-(xy[,i]+xystart),1:ncats-0.4,-xystart,1:ncats+0.4, col=xycolor) rect(xxstart,1:ncats-0.4,xx[,i]+xxstart,1:ncats+0.4, col=xxcolor) xystart<-xy[,i]+xystart xxstart<-xx[,i]+xxstart } } return(oldmar) } __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
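For readers wanting to try the patched function above, here is a minimal usage sketch. The `dados` data frame is made up for illustration (the poster's real data are the ten most frequent cancers by sex); the call assumes the patched `pyramid.plot` has been sourced over the plotrix version, since two-column `labels` are the new feature.

```r
# Made-up stand-in for the poster's 'dados': per-sex frequencies plus one
# label column for each side, the shape the patched function expects.
dados <- data.frame(masfr  = c(30, 25, 20),
                    femfr  = c(28, 22, 35),
                    maslab = c("Pulmao", "Prostata", "Estomago"),
                    femlab = c("Mama", "Colo do utero", "Estomago"),
                    stringsAsFactors = FALSE)

library(plotrix)
source("pyramid.plot.R")  # pick up the patched version posted above

oldmar <- pyramid.plot(dados$masfr, dados$femfr,
                       labels = dados[, c("maslab", "femlab")],
                       top.labels = c("Masculino", "Tipo de cancro", "Feminino"),
                       laxlab = c(0, 10, 20, 30, 40),
                       raxlab = c(0, 10, 20, 30, 40),
                       gap = 25)
par(mar = oldmar)  # the function returns the old margins so you can restore them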
Re: [R] Problem in performing goodness of fit test in R.
Good that you got it to work somehow! Faiz: can you report the result, exactly as returned by R, of sum(c(1,1,1,1,1,1)/6) - 1 ?? On my Linux with R version 2.10.0 (2009-10-26) this gives 0. (And, by the way, did you really mean to say you were using "R 2.1.01"? Was this a typing error for "R 2.10.1"?) Ted. On 14-Feb-10 09:26:00, Faiz Rasool wrote: > Hi all, > > Following Dennis's code, I am able to carry out the goodness of fit test. > > However, the responses I have received indicate that the code I mentioned in > a prior email was not incorrect. Still, the inclusion of rep(1,6)/6 made a difference > for me. I am using Windows XP and R 2.1.01. Does a difference of operating > system make a difference? Without rep(1,6)/6, R would not carry out > chisq.test on my computer. The code I got to work follows. > >> freq=c(22,21,22,27,22,36)#frequencies obtained after rolling the dice >> 150 >> times. >> prob=rep(1,6)/6#change made after Dennis's suggestion. >> chisq.test(freq,p=prob)#no error message received now. > > Chi-squared test for given probabilities > > data: freq > X-squared = 6.72, df = 5, p-value = 0.2423 > > thanks everyone, > Faiz. > > - Original Message - > From: "Ted Harding" > To: > Cc: "Faiz Rasool" > Sent: Sunday, February 14, 2010 2:02 PM > Subject: RE: [R] Problem in performing goodness of fit test in R. > > >> On 14-Feb-10 07:42:12, Faiz Rasool wrote: >>> I am trying to perform goodness of fit test using R. I am using this >>> website >>> http://wiener.math.csi.cuny.edu/Statistics/R/simpleR/stat013.html >>> for help. However, I am unable to carry out the test successfully. My >>> code follows. It is taken from the website just mentioned. >>> >> freq=c(22,21,22,27,22,36) # frequencies obtained after >># rolling the dice 150 times. >> prob=c(1,1,1,1,1,1)/6 # specify expected frequency for each category. >> chisq.test(freq,p=prob) # I do not know what this line means. >> # I just followed instructions on the >> # website. 
>>> >>> The erorr I receive is "erorr in chisq.test(freq,p=prob)/6 >>> probabilities must sum to 1" >>> >>> I am very new to R, so any help would be appreciated. >>> Faiz. >> >> I suspect that you must have made an error in entering the commands >> into R. Prime suspect: You did not have 6 1's in p -- for example >> you may have put >> >> prob=c(1,1,1,1,1)/6 >> >> (with only five). I copied your code (as trivially reformatted above) >> straight into R using copy&paste with the mouse, with results: >> >> freq=c(22,21,22,27,22,36) # frequencies obtained after >># rolling the dice 150 times. >> prob=c(1,1,1,1,1,1)/6 # specify expected frequency for each category. >> chisq.test(freq,p=prob) # I do not know what this line means. >> >> #Chi-squared test for given probabilities >> # data: freq >> # X-squared = 6.72, df = 5, p-value = 0.2423 >> # I just followed instructions on the >> # website. >> >> So it worked as it should work. Therefore something went wrong >> when you entered the code. Check your input! >> >> Hoping this helps, >> Ted. >> >> >> E-Mail: (Ted Harding) >> Fax-to-email: +44 (0)870 094 0861 >> Date: 14-Feb-10 Time: 09:02:01 >> -- XFMail -- > E-Mail: (Ted Harding) Fax-to-email: +44 (0)870 094 0861 Date: 14-Feb-10 Time: 10:53:34 -- XFMail -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
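Ted's sum(c(1,1,1,1,1,1)/6) - 1 question probes exactly the likely failure mode here: whether the probabilities survive floating-point summation to within the tolerance chisq.test checks. A small base-R sketch of the comparisons involved (the p-value line reuses the dice frequencies from this thread):

```r
prob <- rep(1, 6) / 6
sum(prob) - 1                    # 0 on most platforms, though floating-point
                                 # arithmetic does not guarantee exact equality

isTRUE(all.equal(sum(prob), 1))  # tolerance-based comparison: TRUE

# rescale.p = TRUE sidesteps the issue by normalizing p inside the function
chisq.test(c(22, 21, 22, 27, 22, 36), p = prob, rescale.p = TRUE)$p.value
# 0.2423, matching the result reported in this thread
```

If sum(prob) - 1 is not exactly 0 on a given platform, rescale.p = TRUE (as suggested elsewhere in the thread) is the robust fix.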
Re: [R] Problem in performing goodness of fit test in R.
Dear Faiz, On Sat, Feb 13, 2010 at 11:42 PM, Faiz Rasool wrote: > The erorr I receive is "erorr in chisq.test(freq,p=prob)/6 probabilities > must sum to 1" This error message seems a little off. It looks like the entire chi squared test was divided by 6. Notice that the /6 occurs *after* the closing parenthesis for the test. As others have noted, it is probably an input problem, but here is another sample of code you could try. ## all of the data is entered directly into the function. x is the data, p are the probabilities. ## the biggest difference is rather than dividing by 6 to make the probabilities sum to 1, the argument "rescale.p=T" will make sure that they sum to 1 chisq.test(x=c(22,21,22,27,22,36), p=c(1,1,1,1,1,1), rescale.p=T) Chi-squared test for given probabilities data: c(22, 21, 22, 27, 22, 36) X-squared = 6.72, df = 5, p-value = 0.2423 > > I am very new to R, so any help would be appreciated. > Faiz. > Best of luck to you! I have found the list to be very helpful and informative. Joshua -- Joshua Wiley Senior in Psychology University of California, Riverside http://www.joshuawiley.com/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem in performing goodness of fit test in R.
On 14-Feb-10 07:42:12, Faiz Rasool wrote: > I am trying to perform goodness of fit test using R. I am using this > website > http://wiener.math.csi.cuny.edu/Statistics/R/simpleR/stat013.html > for help. However, I am unable to carry out the test successfully. My > code follows. It is taken from the website just mentioned. > freq=c(22,21,22,27,22,36) # frequencies obtained after # rolling the dice 150 times. prob=c(1,1,1,1,1,1)/6 # specify expected frequency for each category. chisq.test(freq,p=prob) # I do not know what this line means. # I just followed instructions on the website. > > The erorr I receive is "erorr in chisq.test(freq,p=prob)/6 > probabilities must sum to 1" > > I am very new to R, so any help would be appreciated. > Faiz. I suspect that you must have made an error in entering the commands into R. Prime suspect: You did not have 6 1's in p -- for example you may have put prob=c(1,1,1,1,1)/6 (with only five). I copied your code (as trivially reformatted above) straight into R using copy&paste with the mouse, with results: freq=c(22,21,22,27,22,36) # frequencies obtained after # rolling the dice 150 times. prob=c(1,1,1,1,1,1)/6 # specify expected frequency for each category. chisq.test(freq,p=prob) # I do not know what this line means. #Chi-squared test for given probabilities # data: freq # X-squared = 6.72, df = 5, p-value = 0.2423 # I just followed instructions on the website. So it worked as it should work. Therefore something went wrong when you entered the code. Check your input! Hoping this helps, Ted. E-Mail: (Ted Harding) Fax-to-email: +44 (0)870 094 0861 Date: 14-Feb-10 Time: 09:02:01 -- XFMail -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem in performing goodness of fit test in R.
On Sun, 2010-02-14 at 12:42 +0500, Faiz Rasool wrote: > I am trying to perform goodness of fit test using R. I am using this website > http://wiener.math.csi.cuny.edu/Statistics/R/simpleR/stat013.html for help. > However, I am unable to carry out the test successfully. My code follows. It > is taken from the website just mentioned. > freq=c(22,21,22,27,22,36) # frequencies obtained after rolling the dice 150 > times. > prob=c(1,1,1,1,1,1)/6 # specify expected frequency for each category. > chisq.test(freq,p=prob) # I do not know what this line means. I just followed > instructions on the website. > The erorr I receive is "erorr in chisq.test(freq,p=prob)/6 probabilities must > sum to 1" > > I am very new to R, so any help would be appreciated. > Faiz. Faiz, Well ... on my computer (Phenom X4 9650, running Ubuntu 9.10 and R 2.10.1) the script works: > freq=c(22,21,22,27,22,36) # frequencies obtained after rolling the dice 150 times. > prob=c(1,1,1,1,1,1)/6 # specify expected frequency for each category. > chisq.test(freq,p=prob) # I do not know what this line means Chi-squared test for given probabilities data: freq X-squared = 6.72, df = 5, p-value = 0.2423 About the third line: you should read ?chisq.test to understand the command better, but it performs a chi-squared test against a uniform probability distribution. -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] histbackback
Dear all, does anyone know how I can run the histbackback function from R Commander? I am going to teach this in my class and I hesitate to ask my students to write code. Thanks, Jason __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
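For what it's worth, a minimal sketch: histbackback() comes from the Hmisc package, not from base R, and as far as I know R Commander has no menu item for it. The call below could, however, be pasted into Rcmdr's script window; the random data are placeholders only.

```r
library(Hmisc)  # assumes the Hmisc package is installed; histbackback() lives here

set.seed(1)
out <- histbackback(rnorm(100), rnorm(100, mean = 1),
                    main = "Back to back histograms")
str(out)  # the function returns the binned counts for each side plus the breaks
```

Students could then be given this snippet to paste rather than asked to write code themselves.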