Re: [R] Manually calculating values from aov() result
Dear Brian,

As Duncan mentioned, the terms type-I, II, and III sums of squares originated in SAS. The type-II and III SSs computed by the Anova() function in the car package take a different computational approach than in SAS, but in almost all cases produce the same results. (I slightly regret using the "type-*" terminology for car::Anova() because of the lack of exact correspondence to SAS.) The standard R anova() function computes type-I (sequential) SSs.

The focus, however, shouldn't be on the SSs, or how they're computed, but on the hypotheses that are tested. Briefly, the hypotheses for type-I tests assume that all terms later in the sequence are 0 in the population; type-II tests assume that interactions to which main effects are marginal (and higher-order interactions to which lower-order interactions are marginal) are 0. Type-III tests don't, e.g., assume that interactions to which a main effect is marginal are 0 in testing the main effect, which then represents an average over the levels of the factor(s) with which the factor in the main effect interacts. The description of the hypotheses for type-III tests is even more complex if there are covariates. In my opinion, researchers are usually interested in the hypotheses for type-II tests.

These matters are described in detail, for example, in my applied regression text <https://www.john-fox.ca/AppliedRegression/index.html>.

I hope this helps,
John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/
--

On 2024-08-07 8:27 a.m., Brian Smith wrote:
Hi, Thanks for this information. Is there any way to force R to use Type-1 SS? I think most textbooks use this only.
Thanks and regards,

On Wed, 7 Aug 2024 at 17:00, Duncan Murdoch wrote:
On 2024-08-07 6:06 a.m., Brian Smith wrote:
Hi, I have performed ANOVA as below

dat = data.frame(
  'A' = c(-0.3960025, -0.3492880, -1.5893792, -1.4579074, -4.9214873,
          -0.8575018, -2.5551363, -0.9366557, -1.4307489, -0.3943704),
  'B' = c(2,1,2,2,1,2,2,2,2,2),
  'C' = c(0,1,1,1,1,1,1,0,1,1))
summary(aov(A ~ B * C, dat))

However now I also tried to calculate the sum of squares for factor C:

Mean = sapply(split(dat, dat$C), function(x) mean(x$A))
N = sapply(split(dat, dat$C), function(x) dim(x)[1])
N[1] * (Mean[1] - mean(dat$A))^2 + N[2] * (Mean[2] - mean(dat$A))^2  # 1.691

But in the ANOVA table the sum of squares for C is reported as 0.77. Could you please help me understand how exactly this C = 0.77 is obtained from aov()?

Your design isn't balanced, so there are several ways to calculate the SS for C. What you have calculated looks like the "Type I SS" in SAS notation, if I remember correctly, assuming that C enters the model before B. That's not what R uses; I think it is Type II SS. For some details about this, see https://mcfromnz.wordpress.com/2011/03/02/anova-type-ii-ss-explained/

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
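To make the point about term order concrete, the following sketch (it assumes the car package is installed) reproduces the two SS values discussed in this thread: sequential (type-I) sums of squares from anova() depend on the order in which terms enter the model, while type-II tests from car::Anova() do not:

```r
# Sketch: sequential (type-I) SS depend on term order; type-II do not.
# Assumes the car package is available for the last call.
dat <- data.frame(
  A = c(-0.3960025, -0.3492880, -1.5893792, -1.4579074, -4.9214873,
        -0.8575018, -2.5551363, -0.9366557, -1.4307489, -0.3943704),
  B = c(2,1,2,2,1,2,2,2,2,2),
  C = c(0,1,1,1,1,1,1,0,1,1))
a1 <- anova(lm(A ~ B * C, dat))  # type-I: SS for C adjusted for B (the 0.77 in the thread)
a2 <- anova(lm(A ~ C * B, dat))  # type-I: SS for C entered first (the manual 1.691)
a1["C", "Sum Sq"]
a2["C", "Sum Sq"]
if (requireNamespace("car", quietly = TRUE))
  car::Anova(lm(A ~ B * C, dat))  # type-II tests, invariant to term order
```

The comparison of a1 and a2 shows why the poster's hand calculation (1.691) disagrees with the aov() table (0.77): both are "type-I" answers, just for different term orders.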
Re: [R] Regression performance when using summary() twice
Dear Christian,

You're apparently using the glm.nb() function in the MASS package. Your function is peculiar in several respects. For example, you specify the model formula as a character string and then convert it into a formula, but you could just pass the formula to the function -- the conversion seems unnecessary. Similarly, you compute the summary for the model twice rather than just saving it in a local variable in your function. And the form of the function output is a bit strange, but I suppose you have reasons for that.

The primary reason that your function is slow, however, is that the confidence intervals computed by confint() profile the likelihood, which requires refitting the model a number of times. If you're willing to use possibly less accurate Wald-based rather than likelihood-based confidence intervals, computed, e.g., by the Confint() function in the car package, then you could speed up the computation considerably. Using a model fit by example(glm.nb),

library(MASS)
example(glm.nb)
microbenchmark::microbenchmark(
  Wald = car::Confint(quine.nb1, vcov.=vcov(quine.nb1), estimate=FALSE),
  LR = confint(quine.nb1)
)

which produces

Unit: microseconds
 expr       min        lq       mean     median       uq        max neval
 Wald   136.366    161.13   222.0872    184.541   283.72    386.466   100
   LR 87223.031  88757.09 95162.8733  95761.568 97672.23 182734.048   100

I hope this helps,
John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/
--

On 2024-06-21 10:38 a.m., c.bu...@posteo.jp wrote:
Hello, I am not a regular R user but coming from Python. But I use R for several special tasks. Doing a regression analysis does cost some compute time. But I wonder when this big time consuming algorithm is executed and if it is done twice in my special case.
It seems that calling "glm()" or similar does not execute the time consuming part of the regression code. It seems it is done when calling "summary(model)". Am I right so far? If this is correct I would say that in my case the regression is done twice with the identical formula and data. Which of course is inefficient. See this code:

my_function <- function(formula_string, data) {
  formula <- as.formula(formula_string)
  model <- glm.nb(formula, data = data)
  result = cbind(summary(model)$coefficients, confint(model))
  result = as.data.frame(result)
  string_result = capture.output(summary(model))
  return(list(result, string_result))
}

I do call summary() once to get the "$coefficients" and a second time when capturing its output as a string. If this really results in computing the regression twice, I ask myself if there is an R-way to make this more efficient?

Best regards,
Christian Buhtz
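A minimal sketch of the fixes John suggests -- accept the formula directly and save the summary in a local variable so it is computed only once (the example model on the quine data is illustrative, not from the original post):

```r
library(MASS)  # for glm.nb() and the quine data

# Sketch: compute summary() once and reuse it for both outputs.
my_function <- function(formula, data) {
  model <- glm.nb(formula, data = data)
  s <- summary(model)  # computed once, reused twice below
  # For speed, confint(model) (profile likelihood) could be replaced by
  # Wald intervals, e.g. car::Confint(model, vcov.=vcov(model), estimate=FALSE)
  result <- as.data.frame(cbind(s$coefficients, confint(model)))
  string_result <- capture.output(s)
  list(result, string_result)
}

res <- my_function(Days ~ Sex, data = quine)
```
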
Re: [R] Column names of model.matrix's output with contrast.arg
Dear Christophe and Ben,

Also see the car package for replacements for contr.treatment(), contr.sum(), and contr.helmert() -- e.g., help("contr.Sum", package="car"). These functions have been in the car package for more than two decades, and AFAIK, no one uses them (including myself). I didn't write a replacement for contr.poly() because the current coefficient labeling seemed reasonably transparent.

Best,
John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/
--

On 2024-06-17 4:29 p.m., Ben Bolker wrote:
It's sorta-kinda-obliquely-partially documented in the examples:

zapsmall(cP <- contr.poly(3)) # Linear and Quadratic

output:

             .L         .Q
[1,] -0.7071068  0.4082483
[2,]  0.0000000 -0.8164966
[3,]  0.7071068  0.4082483

FWIW the faux package provides better-named alternatives.

On 2024-06-17 4:25 p.m., Christophe Dutang wrote:
Thanks for your reply. It might be good to document the naming convention in ?contrasts. It is hard to understand .L for linear, .Q for quadratic, .C for cubic and ^n for other degrees. For contr.sum, we could have used .Sum, .Sum… Maybe the examples in ?model.matrix should use names in dd objects so that we observe when names are dropped. Kind regards, Christophe

Le 14 juin 2024 à 11:45, peter dalgaard a écrit :
You're at the mercy of the various contr.XXX functions. They may or may not set the colnames on the matrices that they generate. The rationales for (not) setting them are not perfectly transparent, but you obviously cannot use level names on contr.poly, so it uses .L, .Q, etc. In MASS, contr.sdif is careful about labeling the columns with the levels that are being diff'ed. For contr.treatment, there is a straightforward connection to 0/1 dummy variables, so level names there are natural. One could use levels in contr.sum and contr.helmert, but it might confuse users that comparisons are with the average of all levels or preceding levels.
(It can be quite confusing when coding is +1 for male and -1 for female, so that the gender difference is twice the coefficient.)

-pd

On 14 Jun 2024, at 08:12, Christophe Dutang wrote:
Dear list,

Changing the default contrasts used in glm() makes me aware how model.matrix() sets column names. With default contrasts, model.matrix() uses the level values to name the columns. However with other contrasts, model.matrix() uses the level indexes. I don't see anything in the documentation related to this. It does not seem natural to have such a behavior. Any comment is welcome. An example is below.

Kind regards,
Christophe

# example from ?glm
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- paste0("O", gl(3,1,9))
treatment <- paste0("T", gl(3,3))
X3 <- model.matrix(counts ~ outcome + treatment)
X4 <- model.matrix(counts ~ outcome + treatment,
                   contrasts = list("outcome"="contr.sum"))
X5 <- model.matrix(counts ~ outcome + treatment,
                   contrasts = list("outcome"="contr.helmert"))

# check with original factor
cbind.data.frame(X3, outcome)
cbind.data.frame(X4, outcome)
cbind.data.frame(X5, outcome)

# same issue with glm
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
glm.D94 <- glm(counts ~ outcome + treatment, family = poisson(),
               contrasts = list("outcome"="contr.sum"))
glm.D95 <- glm(counts ~ outcome + treatment, family = poisson(),
               contrasts = list("outcome"="contr.helmert"))
coef(glm.D93)
coef(glm.D94)
coef(glm.D95)

# check linear predictor
cbind(X3 %*% coef(glm.D93), predict(glm.D93))
cbind(X4 %*% coef(glm.D94), predict(glm.D94))

-
Christophe DUTANG
LJK, Ensimag, Grenoble INP, UGA, France
ILB research fellow
Web: http://dutangc.free.fr
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk Priv: pda...@gmail.com

--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering (Acting)
Graduate chair, Mathe
Re: [R] Listing folders on One Drive
Dear Nick,

See list.dirs(), which is documented in the same help file as list.files().

I hope this helps,
John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/
--

On 2024-05-20 9:36 a.m., Nick Wray wrote:
Hello, I have lots of folders of individual Scottish river catchments on my uni One Drive. Each folder is labelled with the river name, e.g. "Tay", and they are all in a folder named "Scotland". I want to list the folders on One Drive so that I can cross-check that I have them all against a list of folders on my laptop. Can I somehow use list.files()? I've tried various things but none seem to work... Any help appreciated. Thanks, Nick Wray
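A sketch of the cross-check Nick describes, using list.dirs() as John suggests (the two paths are hypothetical placeholders -- substitute the real OneDrive and laptop locations):

```r
# Sketch: compare folder names in two locations with list.dirs().
# The paths below are hypothetical; list.dirs() on a nonexistent path
# simply returns character(0).
onedrive_dirs <- basename(list.dirs("OneDrive/Scotland", recursive = FALSE))
laptop_dirs   <- basename(list.dirs("~/Scotland",        recursive = FALSE))
setdiff(laptop_dirs, onedrive_dirs)  # on the laptop but not on OneDrive
setdiff(onedrive_dirs, laptop_dirs)  # on OneDrive but not on the laptop
```

basename() strips the leading path so that folder names from the two locations can be compared directly.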
Re: [R] x[0]: Can '0' be made an allowed index in R?
Hello Peter,

Unless I too misunderstand your point, negative indices for removal do work with the Oarray package (though -0 doesn't work to remove the 0th element, since -0 == 0 -- perhaps what you meant):

> library(Oarray)
> v <- Oarray(1:10, offset=0)
> v
[0,] [1,] [2,] [3,] [4,] [5,] [6,] [7,] [8,] [9,]
   1    2    3    4    5    6    7    8    9   10
> dim(v)
[1] 10
> v[-1]
[1] 1 3 4 5 6 7 8 9 10
> v[-0]
[1] 1

Best,
John

On 2024-04-23 9:03 a.m., Peter Dalgaard via R-help wrote:
Doesn't sound like you got the point. x[-1] normally removes the first element. With 0-based indices, this cannot work.

- pd

On 22 Apr 2024, at 17:31, Ebert, Timothy Aaron wrote:
You could have negative indices. There are two ways to do this. 1) provide a large offset:

Offset <- 30
for (i in -29:120) { print(df[i + Offset]) }

2) use absolute values if all indices are negative:

for (i in -200:-1) { print(df[abs(i)]) }

Tim

-Original Message-
From: R-help On Behalf Of Peter Dalgaard via R-help
Sent: Monday, April 22, 2024 10:36 AM
To: Rolf Turner
Cc: R help project ; Hans W
Subject: Re: [R] x[0]: Can '0' be made an allowed index in R?

Heh. Did anyone bring up negative indices yet?

-pd

On 22 Apr 2024, at 10:46, Rolf Turner wrote:
See fortunes::fortune(36).

cheers,
Rolf Turner

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Stats. Dep't. (secretaries) phone: +64-9-373-7599 ext.
89622 Home phone: +64-9-480-4619

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] Use of geometric mean .. in good data analysis
Dear Martin,

Helpful general advice, although it's perhaps worth mentioning that the geometric mean, defined e.g. naively as prod(x)^(1/length(x)), is necessarily 0 if there are any 0 values in x. That is, the geometric mean "works" in this case but isn't really informative.

Best,
John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2024-01-22 12:18 p.m., Martin Maechler wrote:

Rich Shepard on Mon, 22 Jan 2024 07:45:31 -0800 (PST) writes:
> A statistical question, not specific to R. I'm asking for a pointer for a source of definitive descriptions of what types of data are best summarized by the arithmetic, geometric, and harmonic means.

In spite of being off-topic: I think it is a good question, not really only about geo-chemistry, but about statistics in applied sciences (and engineering for that matter). Something I'm sure good applied statisticians in the 1980's and 1990's would all have known the answer to:

To use the geometric mean instead of the arithmetic mean is basically *equivalent* to first log-transforming the data and then working with the transformed data: not just for computing the average, but for more relevant modelling, inference, etc.

John W Tukey (and several other of the grands of the time) had the log transform among the "First aid transformations": If the data for a continuous variable must all be positive, it is also typically the case that the distribution is considerably skewed to the right. In such a case, behave as a good human who sees another human in health distress: apply First Aid -- do the things you learned to do quickly without too much thought, because things must happen fast -- to hopefully save the other's life. Here: Do log-transform all such variables without further ado, and only afterwards start your (exploratory and more) data analysis.
Now, mean(log(y)) = log(geometricmean(y)), where mean() is the arithmetic mean as in R {mathematically; on the computer you need all.equal(), not '==' !!}. I.e., according to Tukey and all the other experienced applied statisticians of the past, the geometric mean is the "best thing" to do for such positive right-skewed data, in the same sense that the log-transform is the best "a priori" transformation for such data -- with the one advantage even that you need to fiddle with zeroes when log-transforming, whereas the geometric mean works already for zeroes.

Martin

> As an aquatic ecologist I see regulators apply the geometric mean to geochemical concentrations rather than using the arithmetic mean. I want to know whether the geometric mean of a set of chemical concentrations (e.g., in mg/L) is an appropriate representation of the expected value. If not, I want to explain this to non-technical decision-makers; if so, I want to understand why my assumption is wrong.
> TIA,
> Rich
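A quick numerical sketch of the identity Martin states, including his point that the comparison needs all.equal() rather than '==' on the computer, and John's caveat about zeroes:

```r
# mean(log(y)) equals log of the geometric mean -- mathematically exact,
# so compare with all.equal() rather than '==' in floating point.
y <- c(2, 8, 32)
geomean <- prod(y)^(1/length(y))       # 8, up to rounding
all.equal(mean(log(y)), log(geomean))  # TRUE

# John's caveat: a single zero forces the geometric mean to 0.
prod(c(0, y))^(1/4)                    # 0
```
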
Re: [R] Is there any design based two proportions z test?
Dear Md Kamruzzaman,

I've copied this response to the r-help list, where you originally asked your question. That way, other people can follow the conversation if they're interested, and there will be a record of the solution. Please keep r-help in the loop.

See below:

On 2024-01-17 9:47 p.m., Md. Kamruzzaman wrote:
Dear John, Thank you so much for your reply. I have calculated the 95%CI of the separate two proportions by using the survey package. The code is given below.

svyby(~Diabetes_Cate, ~Year, nhc, svymean, na=TRUE)

Here: nhc is the weighted survey data. I understand your point that it is possible to calculate the 95%CI of the proportional difference manually. It is time consuming, that's why I was looking for a function with a design effect to calculate this easily. I couldn't find this kind of function. However, it will be okay for me to calculate this manually, if there are no functions like this.

If you intend to do this computation once, it's not terribly time consuming. If you intend to do it repeatedly, you can write a simple function to do the calculation, probably in less time than it takes to search for one.

For manual calculation, could you please share the formula to calculate the 95%CI of the proportional difference?

Here's a simple function to compute the confidence interval, assuming that the normal distribution is used. The formula is based on the elementary result that the variance of the difference of two independent random variables is the sum of their variances, plus the observation that the width of the confidence interval is 2*z*SE, where z is the normal quantile corresponding to the confidence level (e.g., 1.96 for a 95% CI).
ciDiff <- function(ci1, ci2, level=0.95){
  p1 <- mean(ci1)
  p2 <- mean(ci2)
  z <- qnorm((1 - level)/2, lower.tail=FALSE)
  se1 <- (ci1[2] - ci1[1])/(2*z)
  se2 <- (ci2[2] - ci2[1])/(2*z)
  seDiff <- sqrt(se1^2 + se2^2)
  (p1 - p2) + c(-z, z)*seDiff
}

Example:

Prevalence of Diabetes:
2011: 11.0 (95%CI 10.1-11.9)
2017: 10.1 (95%CI 9.4-10.9)
Diff: 0.9% (95%CI: ??)

These are percentages, not proportions, but you can use either:

> ciDiff(c(10.1, 11.9), c(9.4, 10.9))
[1] -0.3215375  2.0215375
> ciDiff(c(.101, .119), c(.094, .109))
[1] -0.003215375  0.020215375

You'll want more significant digits in the inputs to get sufficiently precise results. Since I did this quickly, if I were you I'd check the results manually.

Best,
John

With Kind Regards
---------
Md Kamruzzaman

On Thu, Jan 18, 2024 at 12:44 AM John Fox <j...@mcmaster.ca> wrote:
Dear Md Kamruzzaman,

To answer your second question first, you could just use the svychisq() function. The difference-of-proportion test is equivalent to a chisquare test for the 2-by-2 table.

You don't say how you computed the confidence intervals for the two separate proportions, but if you have their standard errors (and if not, you should be able to infer them from the confidence intervals) you can compute the variance of the difference as the sum of the variances (squared standard errors), because the two proportions are independent, and from that the confidence interval for their difference.

I hope this helps,
John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2024-01-16 10:21 p.m., Md. Kamruzzaman wrote:
Hello Everyone, I was analysing big survey data using survey packages on RStudio.
The survey package allows survey data analysis with the design effect. The survey package included functions for all other statistical analysis except two-proportion z tests.

I was trying to calculate the difference in prevalence of Diabetes and Prediabetes between the year 2011 and 2017 (with 95%CI). I was able to calculate the weighted prevalence of diabetes and prediabetes in the Year 2011 and 2017 and just subtracted the prevalence of 2011 from the prevalence of 2017 to get the difference in prevalence. But I could not calculate the 95%CI of the difference in prevalence considering the weight of the survey data.
Re: [R] Is there any design based two proportions z test?
Dear Md Kamruzzaman,

To answer your second question first, you could just use the svychisq() function. The difference-of-proportion test is equivalent to a chisquare test for the 2-by-2 table.

You don't say how you computed the confidence intervals for the two separate proportions, but if you have their standard errors (and if not, you should be able to infer them from the confidence intervals) you can compute the variance of the difference as the sum of the variances (squared standard errors), because the two proportions are independent, and from that the confidence interval for their difference.

I hope this helps,
John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2024-01-16 10:21 p.m., Md. Kamruzzaman wrote:
Hello Everyone, I was analysing big survey data using survey packages on RStudio. The survey package allows survey data analysis with the design effect. The survey package included functions for all other statistical analysis except two-proportion z tests.

I was trying to calculate the difference in prevalence of Diabetes and Prediabetes between the year 2011 and 2017 (with 95%CI). I was able to calculate the weighted prevalence of diabetes and prediabetes in the Year 2011 and 2017 and just subtracted the prevalence of 2011 from the prevalence of 2017 to get the difference in prevalence. But I could not calculate the 95%CI of the difference in prevalence considering the weight of the survey data.

I was also trying to see if this difference in prevalence is statistically significant. I could do it using the simple two-proportion z test without considering the weight of the sample. But I want to do it considering the weight of the sample.
Example:

Prevalence of Diabetes:
2011: 11.0 (95%CI 10.1-11.9)
2017: 10.1 (95%CI 9.4-10.9)
Diff: 0.9% (95%CI: ??)
Proportion Z test P Value: ??

Your cooperation will be highly appreciated. Thanks in advance.

With Regards
Md Kamruzzaman
PhD Research Fellow (Medicine)
Discipline of Medicine and Centre of Research Excellence in Translating Nutritional Science to Good Health
Adelaide Medical School | Faculty of Health and Medical Sciences
The University of Adelaide
Adelaide SA 5005
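The "Proportion Z test P Value: ??" in the example can be filled in with the same CI-derived standard errors used in John's ciDiff() function above. A hedged sketch (this uses a normal approximation and, like the manual CI, ignores the survey design; zTestDiff is a hypothetical helper name):

```r
# Sketch: z test for a difference of proportions, with standard errors
# recovered from the reported 95% CIs (normal approximation assumed).
zTestDiff <- function(ci1, ci2, level = 0.95) {
  z <- qnorm((1 - level)/2, lower.tail = FALSE)
  p1 <- mean(ci1); p2 <- mean(ci2)
  se <- sqrt(((ci1[2] - ci1[1])/(2*z))^2 + ((ci2[2] - ci2[1])/(2*z))^2)
  zstat <- (p1 - p2)/se
  c(z = zstat, p.value = 2*pnorm(abs(zstat), lower.tail = FALSE))
}

zTestDiff(c(10.1, 11.9), c(9.4, 10.9))  # difference of 0.9, not significant at 0.05
```

As with ciDiff(), more significant digits in the input CIs give a more precise result.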
Re: [R] car::deltaMethod() fails when a particular combination of categorical variables is not present
Dear Michael,

My previous response was inaccurate: First, linearHypothesis() *is* able to accommodate aliased coefficients by setting the argument singular.ok = TRUE:

> linearHypothesis(minimal_model, "bt2 + csent + bt2:csent = 0",
+                  singular.ok=TRUE)
Linear hypothesis test:
bt2 + csent + bt2:csent = 0

Model 1: restricted model
Model 2: a ~ b * c

  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     16 9392.1
2     15 9266.4  1    125.67 0.2034 0.6584

Moreover, when there is an empty cell, this F-test is (for a reason that I haven't worked out, but is almost surely due to how the rank-deficient model is parametrized) *not* equivalent to the t-test for the corresponding coefficient in the raveled version of the two factors:

> df$bc <- factor(with(df, paste(b, c, sep=":")))
> m <- lm(a ~ bc, data=df)
> summary(m)

Call:
lm(formula = a ~ bc, data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-57.455 -11.750   0.439  14.011  37.545

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    20.50      17.57   1.166   0.2617
bct1:unsent    37.50      24.85   1.509   0.1521
bct2:other     32.00      24.85   1.287   0.2174
bct2:sent      17.17      22.69   0.757   0.4610  <<< cf. F = 0.2034, p = 0.6584
bct2:unsent    38.95      19.11   2.039   0.0595

Residual standard error: 24.85 on 15 degrees of freedom
Multiple R-squared: 0.2613, Adjusted R-squared: 0.06437
F-statistic: 1.327 on 4 and 15 DF, p-value: 0.3052

In the full-rank case, however, what I said is correct -- that is, the F-test for the 1 df hypothesis on the three coefficients is equivalent to the t-test for the corresponding coefficient when the two factors are raveled:

> linearHypothesis(minimal_model_fixed, "bt2 + csent + bt2:csent = 0")
Linear hypothesis test:
bt2 + csent + bt2:csent = 0

Model 1: restricted model
Model 2: a ~ b * c

  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     15 9714.5
2     14 9194.4  1    520.08 0.7919 0.3886

> df_fixed$bc <- factor(with(df_fixed, paste(b, c, sep=":")))
> m <- lm(a ~ bc, data=df_fixed)
> summary(m)

Call:
lm(formula = a ~ bc, data = df_fixed)

Residuals:
    Min      1Q  Median      3Q     Max
-57.455 -11.750   0.167  14.011  37.545

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   64.000     25.627   2.497   0.0256
bct1:sent    -43.500     31.387  -1.386   0.1874
bct1:unsent  -12.000     36.242  -0.331   0.7455
bct2:other   -11.500     31.387  -0.366   0.7195
bct2:sent    -26.333     29.591  -0.890   0.3886  << cf.
bct2:unsent   -4.545     26.767  -0.170   0.8676

Residual standard error: 25.63 on 14 degrees of freedom
Multiple R-squared: 0.2671, Adjusted R-squared: 0.005328
F-statistic: 1.02 on 5 and 14 DF, p-value: 0.4425

So, to summarize:

(1) You can use linearHypothesis() with singular.ok=TRUE to test the hypothesis that you specified, though I suspect that this hypothesis probably isn't testing what you think in the rank-deficient case. I suspect that the hypothesis that you want to test is obtained by raveling the two factors.

(2) There is no reason to use deltaMethod() for a linear hypothesis, but there is also no intrinsic reason that deltaMethod() shouldn't be able to handle a rank-deficient model. We'll probably fix that.
My apologies for the confusion,
John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2023-09-26 9:49 a.m., John Fox wrote:
Dear Michael,

You're testing a linear hypothesis, so there's no need to use the delta method, but the linearHypothesis() function in the car package also fails in your case:

> linearHypothesis(minimal_model, "bt2 + csent + bt2:csent = 0")
Error in linearHypothesis.lm(minimal_model, "bt2 + csent + bt2:csent = 0") :
  there are aliased coefficients in the model.

One work-around is to ravel the two factors into a single factor with 5 levels:

> df$bc <- factor(with(df, paste(b, c, sep=":")))
> df$bc
 [1] t2:unsent t2:unsent t2:unsent t2:unsent t2:sent   t2:unsent
 [7] t2:unsent t1:sent   t2:unsent t2:unsent t2:other  t2:unsent
[13] t1:unsent t1:sent   t2:unsent t2:other  t1:unsent t2:sent
[19] t2:sent   t2:unsent
Levels: t1:sent t1:unsent t2:other t2:sent t2:unsent
> m <- lm(a ~ bc, data=df)
> summary(m)

Call:
lm(formula = a ~ bc, data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-57.455 -11.750   0.439  14.011  37.545

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    20.50      17.57   1.166   0.2617
bct1:unsent    37.50      24.85   1.509   0.1521
bct2:other     32.00      24.85   1.287   0.2174
bct2:sent      17.17      22.69   0.757   0.4610
bct2:unsent    38.95      19.11   2.039   0.0595

Residual sta
Re: [R] car::deltaMethod() fails when a particular combination of categorical variables is not present
Dear Michael,

You're testing a linear hypothesis, so there's no need to use the delta method, but the linearHypothesis() function in the car package also fails in your case:

> linearHypothesis(minimal_model, "bt2 + csent + bt2:csent = 0")
Error in linearHypothesis.lm(minimal_model, "bt2 + csent + bt2:csent = 0") :
  there are aliased coefficients in the model.

One work-around is to ravel the two factors into a single factor with 5 levels:

> df$bc <- factor(with(df, paste(b, c, sep=":")))
> df$bc
 [1] t2:unsent t2:unsent t2:unsent t2:unsent t2:sent   t2:unsent
 [7] t2:unsent t1:sent   t2:unsent t2:unsent t2:other  t2:unsent
[13] t1:unsent t1:sent   t2:unsent t2:other  t1:unsent t2:sent
[19] t2:sent   t2:unsent
Levels: t1:sent t1:unsent t2:other t2:sent t2:unsent

> m <- lm(a ~ bc, data=df)
> summary(m)

Call:
lm(formula = a ~ bc, data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-57.455 -11.750   0.439  14.011  37.545

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    20.50      17.57   1.166   0.2617
bct1:unsent    37.50      24.85   1.509   0.1521
bct2:other     32.00      24.85   1.287   0.2174
bct2:sent      17.17      22.69   0.757   0.4610
bct2:unsent    38.95      19.11   2.039   0.0595

Residual standard error: 24.85 on 15 degrees of freedom
Multiple R-squared: 0.2613, Adjusted R-squared: 0.06437
F-statistic: 1.327 on 4 and 15 DF, p-value: 0.3052

Then the hypothesis is tested directly by the t-value for the coefficient bct2:sent.

I hope that this helps,
 John
--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2023-09-26 1:12 a.m., Michael Cohn wrote:

I'm running a linear regression with two categorical predictors and their interaction. One combination of levels does not occur in the data, and, as expected, no parameter is estimated for it. I now want to significance-test a particular combination of levels that does occur in the data (i.e., I want to get a confidence interval for the total prediction at given levels of each variable).
In the past I've done this using car::deltaMethod(), but in this dataset that does not work, as shown in the example below. The regression model gives the expected output, but deltaMethod() gives this error:

error in t(gd) %*% vcov. : non-conformable arguments

I believe this is because there is no parameter estimate for when the predictors have the values 't1' and 'other'. In the df_fixed dataframe, putting one person into that combination of categories causes deltaMethod() to work as expected. I don't know of any theoretical reason that missing one interaction parameter estimate should prevent getting a confidence interval for a different combination of predictors. Is there a way to use deltaMethod() or some other function to do this without changing my data?

Thank you,

- Michael Cohn
Vote Rev (http://voterev.org)

Demonstration:
--
library(car)

# create dataset with outcome and two categorical predictors
outcomes <- c(91,2,60,53,38,78,48,33,97,41,64,84,64,8,66,41,52,18,57,34)
persontype <- c("t2","t2","t2","t2","t2","t2","t2","t1","t2","t2","t2","t2","t1","t1","t2","t2","t1","t2","t2","t2")
arm_letter <- c("unsent","unsent","unsent","unsent","sent","unsent","unsent","sent","unsent","unsent","other","unsent","unsent","sent","unsent","other","unsent","sent","sent","unsent")
df <- data.frame(a = outcomes, b = persontype, c = arm_letter)

# note: there are no records with the combination 't1' + 'other'
table(df$b, df$c)

# regression works as expected
minimal_formula <- formula("a ~ b*c")
minimal_model <- lm(minimal_formula, data=df)
summary(minimal_model)

# use deltaMethod() to get a prediction for individuals with the combination 'b2' and 'sent'
# deltaMethod() fails with "error in t(gd) %*% vcov. : non-conformable arguments."
deltaMethod(minimal_model, "bt2 + csent + `bt2:csent`", rhs=0)

# duplicate the dataset and change one record to be in the previously empty cell
df_fixed <- df
df_fixed[c(13), "c"] <- 'other'
table(df_fixed$b, df_fixed$c)

# deltaMethod() now works
minimal_model_fixed <- lm(minimal_formula, data=df_fixed)
deltaMethod(minimal_model_fixed, "bt2 + csent + `bt2:csent`", rhs=0)

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
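The raveling work-around above can also be expressed with base R's interaction() function, which builds the combined factor directly and, with drop = TRUE, omits the empty 't1:other' cell; the car package isn't needed for the fit itself. A minimal sketch using the data from the demonstration (interaction() here is my substitution for the paste() construction; lex.order = TRUE reproduces the same level ordering, so the bct2:sent coefficient matches John's output):

```r
# data from the demonstration above
outcomes <- c(91,2,60,53,38,78,48,33,97,41,64,84,64,8,66,41,52,18,57,34)
persontype <- c("t2","t2","t2","t2","t2","t2","t2","t1","t2","t2","t2","t2","t1","t1","t2","t2","t1","t2","t2","t2")
arm_letter <- c("unsent","unsent","unsent","unsent","sent","unsent","unsent","sent","unsent","unsent","other","unsent","unsent","sent","unsent","other","unsent","sent","sent","unsent")
df <- data.frame(a = outcomes, b = persontype, c = arm_letter)

# ravel the two factors; drop = TRUE omits the empty t1:other cell
df$bc <- interaction(df$b, df$c, sep = ":", drop = TRUE, lex.order = TRUE)
nlevels(df$bc)    # 5 observed cells, not 2 x 3 = 6

m <- lm(a ~ bc, data = df)
coef(summary(m))  # the bct2:sent row carries the test of interest
```

The full-rank reparameterization makes the quantity of interest a single coefficient, so its t-test and confidence interval come straight from summary(m) and confint(m).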
Re: [R] Print hypothesis warning- Car package
Hi Peter,

On 2023-09-18 10:08 a.m., peter dalgaard wrote:
> Also, I would guess that the code precedes the use of backticks in non-syntactic names.

Indeed, by more than a decade (though modified in the interim).

> Could they be deployed here?

I don't think so, at least not without changing how the function works. The problem doesn't occur when the hypothesis is specified symbolically as a character vector, including in equation form, only when the hypothesis matrix is given directly, in which case linearHypothesis() tries to construct the equation-form representation, again as character vectors. Its inability to do so when the coefficient names include arithmetic operators doesn't, I think, require a warning or even a message: the symbolic representation of the hypothesis can simply be omitted. The numeric results reported are entirely unaffected.

I've made this change and will commit it to the next version of the car package.

Thank you for the suggestion,
 John

> - Peter

On 17 Sep 2023, at 16:43, John Fox wrote:

Dear Robert,

Anova() calls linearHypothesis(), also in the car package, to compute sums of squares and df, supplying appropriate hypothesis matrices. linearHypothesis() usually tries to express the hypothesis matrix in symbolic equation form for printing, but won't do this if coefficient names include arithmetic operators, in your case - and +, which can confuse it. The symbolic form of the hypothesis isn't really relevant for Anova(), which doesn't use the printed representation of each hypothesis, and so, despite the warnings, you get the correct ANOVA table.

In your case, where the data are balanced, with 4 cases per cell, Anova(mod) and summary(mod) are equivalent, which makes me wonder why you would use Anova() in the first place.

To elaborate a bit, linearHypothesis() does tolerate arithmetic operators in coefficient names if you specify the hypothesis symbolically rather than as a hypothesis matrix.
For example, to test the interaction:

--- snip ---

> linearHypothesis(mod,
+   c("TreatmentDabrafenib:ExpressionCD271+ = 0",
+     "TreatmentTrametinib:ExpressionCD271+ = 0",
+     "TreatmentCombination:ExpressionCD271+ = 0"))
Linear hypothesis test

Hypothesis:
TreatmentDabrafenib:ExpressionCD271+ = 0
TreatmentTrametinib:ExpressionCD271+ = 0
TreatmentCombination:ExpressionCD271+ = 0

Model 1: restricted model
Model 2: Viability ~ Treatment * Expression

  Res.Df   RSS Df Sum of Sq     F Pr(>F)
1     27 18966
2     24 16739  3    2226.3 1.064 0.3828

--- snip ---

Alternatively:

--- snip ---

> H <- matrix(0, 3, 8)
> H[1, 6] <- H[2, 7] <- H[3, 8] <- 1
> H
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    0    0    0    0    0    1    0    0
[2,]    0    0    0    0    0    0    1    0
[3,]    0    0    0    0    0    0    0    1
> linearHypothesis(mod, H)
Linear hypothesis test

Hypothesis:

Model 1: restricted model
Model 2: Viability ~ Treatment * Expression

  Res.Df   RSS Df Sum of Sq     F Pr(>F)
1     27 18966
2     24 16739  3    2226.3 1.064 0.3828

Warning message:
In printHypothesis(L, rhs, names(b)) :
  one or more coefficients in the hypothesis include arithmetic operators
  in their names; the printed representation of the hypothesis will be omitted

--- snip ---

There's no good reason that linearHypothesis() should try to express each hypothesis symbolically for Anova(), since Anova() doesn't use that information. When I have some time, I'll arrange to avoid the warning.

Best,
 John
--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2023-09-16 4:39 p.m., Robert Baer wrote:

When doing Anova using the car package, I get a print warning that is unexpected. It seemingly involves my flow cytometry factor levels named CD271+ and CD271-. But I am not sure this warning should be intended behavior. Any explanation of whether I'm doing something wrong? Why can't I have CD271+ and CD271- as factor levels? It's legal text, isn't it?
library(car)
mod = aov(Viability ~ Treatment*Expression, data = dat1)
Anova(mod, type = 2)

Anova Table (Type II tests)

Response: Viability
                      Sum Sq Df F value    Pr(>F)
Treatment            19447.3  3  9.2942 0.0002927 ***
Expression            2669.8  1  3.8279 0.0621394 .
Treatment:Expression  2226.3  3  1.0640 0.3828336
Residuals            16739.3 24
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Warning messages:
1: In printHypothesis(L, rhs, names(b)) :
  one or more coefficients in the hypothesis include arithmetic operators
  in their names; the printed representation of the hypothesis will be omitted
2: In printHypothesis(L, rhs, names(b)) :
  one or more coefficients in the hypothesis include arithmetic operators
  in their names; the printed representation of the hypothesis will be omitted
Re: [R] Print hypothesis warning- Car package
Dear Robert,

Anova() calls linearHypothesis(), also in the car package, to compute sums of squares and df, supplying appropriate hypothesis matrices. linearHypothesis() usually tries to express the hypothesis matrix in symbolic equation form for printing, but won't do this if coefficient names include arithmetic operators, in your case - and +, which can confuse it. The symbolic form of the hypothesis isn't really relevant for Anova(), which doesn't use the printed representation of each hypothesis, and so, despite the warnings, you get the correct ANOVA table.

In your case, where the data are balanced, with 4 cases per cell, Anova(mod) and summary(mod) are equivalent, which makes me wonder why you would use Anova() in the first place.

To elaborate a bit, linearHypothesis() does tolerate arithmetic operators in coefficient names if you specify the hypothesis symbolically rather than as a hypothesis matrix. For example, to test the interaction:

--- snip ---

> linearHypothesis(mod,
+   c("TreatmentDabrafenib:ExpressionCD271+ = 0",
+     "TreatmentTrametinib:ExpressionCD271+ = 0",
+     "TreatmentCombination:ExpressionCD271+ = 0"))
Linear hypothesis test

Hypothesis:
TreatmentDabrafenib:ExpressionCD271+ = 0
TreatmentTrametinib:ExpressionCD271+ = 0
TreatmentCombination:ExpressionCD271+ = 0

Model 1: restricted model
Model 2: Viability ~ Treatment * Expression

  Res.Df   RSS Df Sum of Sq     F Pr(>F)
1     27 18966
2     24 16739  3    2226.3 1.064 0.3828

--- snip ---

Alternatively:

--- snip ---

> H <- matrix(0, 3, 8)
> H[1, 6] <- H[2, 7] <- H[3, 8] <- 1
> H
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    0    0    0    0    0    1    0    0
[2,]    0    0    0    0    0    0    1    0
[3,]    0    0    0    0    0    0    0    1
> linearHypothesis(mod, H)
Linear hypothesis test

Hypothesis:

Model 1: restricted model
Model 2: Viability ~ Treatment * Expression

  Res.Df   RSS Df Sum of Sq     F Pr(>F)
1     27 18966
2     24 16739  3    2226.3 1.064 0.3828

Warning message:
In printHypothesis(L, rhs, names(b)) :
  one or more coefficients in the hypothesis include arithmetic operators
  in their names; the printed representation of the hypothesis will be omitted

--- snip ---

There's no good reason that linearHypothesis() should try to express each hypothesis symbolically for Anova(), since Anova() doesn't use that information. When I have some time, I'll arrange to avoid the warning.

Best,
 John
--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2023-09-16 4:39 p.m., Robert Baer wrote:

When doing Anova using the car package, I get a print warning that is unexpected. It seemingly involves my flow cytometry factor levels named CD271+ and CD271-. But I am not sure this warning should be intended behavior. Any explanation of whether I'm doing something wrong? Why can't I have CD271+ and CD271- as factor levels? It's legal text, isn't it?

library(car)
mod = aov(Viability ~ Treatment*Expression, data = dat1)
Anova(mod, type = 2)

Anova Table (Type II tests)

Response: Viability
                      Sum Sq Df F value    Pr(>F)
Treatment            19447.3  3  9.2942 0.0002927 ***
Expression            2669.8  1  3.8279 0.0621394 .
Treatment:Expression  2226.3  3  1.0640 0.3828336
Residuals            16739.3 24
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Warning messages:
1: In printHypothesis(L, rhs, names(b)) :
  one or more coefficients in the hypothesis include arithmetic operators
  in their names; the printed representation of the hypothesis will be omitted
2: In printHypothesis(L, rhs, names(b)) :
  one or more coefficients in the hypothesis include arithmetic operators
  in their names; the printed representation of the hypothesis will be omitted
3: In printHypothesis(L, rhs, names(b)) :
  one or more coefficients in the hypothesis include arithmetic operators
  in their names; the printed representation of the hypothesis will be omitted

The code to reproduce:

```
dat1 <- structure(list(Treatment = structure(c(1L, 1L, 1L, 1L, 3L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), levels = c("Control",
"Dabrafenib", "Trametinib", "Combination"), class = "factor"),
Expression = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L), levels = c("CD271-", "CD271+"), class = "factor"),
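Until the change John describes lands in car, the warning can also be avoided on the user's side by renaming the levels so they contain no arithmetic operators. This is only a work-around of my own, not anything the car package requires; the sub() patterns below assume the operator sits at the end of the level name, as with CD271+ and CD271-:

```r
# factor levels ending in "+" and "-", as in the flow-cytometry data
Expression <- factor(c("CD271+", "CD271-", "CD271+", "CD271-"),
                     levels = c("CD271-", "CD271+"))

# replace the trailing operators with syntactic suffixes
levels(Expression) <- sub("\\+$", "pos", sub("-$", "neg", levels(Expression)))
levels(Expression)  # "CD271neg" "CD271pos"
```

Coefficient names built from the renamed levels contain no '+' or '-', so printHypothesis() can display the hypothesis and the warnings disappear, at the cost of slightly less readable labels in the output.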
Re: [R] Determining Starting Values for Model Parameters in Nonlinear Regression
Dear John, John, and Paul,

In this case, one can obtain starting values by just fitting the linearized model:

> lm(1/y ~ x1 + x2 + x3 - 1, data=mydata)

Call:
lm(formula = 1/y ~ x1 + x2 + x3 - 1, data = mydata)

Coefficients:
     x1       x2       x3
0.00629  0.00868  0.00803

Of course, the errors enter this model differently, so this isn't the same as the nonlinear model, but the regression coefficients are very close to the estimates for the nonlinear model.

Best,
 John
--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2023-08-19 6:39 p.m., Sorkin, John wrote:

Colleagues,

At the risk of starting a forest fire, or perhaps a brush fire: while it is good to see that nlxb can find a solution from arbitrary starting values, I think Paul's question has merit despite Professor Nash's excellent and helpful observation. Although non-linear algorithms can converge, they can converge to a false solution if starting values are sub-optimally specified. When possible, I try to specify thought-out starting values. Would it make sense to plot y as a function of (x1, x2) at different values of x3 to get a sense of possible starting values? Or perhaps to use median values of x1, x2, and x3 as starting values? Comparing results from different starting values can give some confidence that the solution obtained using arbitrary starting values is likely "correct". I freely admit that my experience (and thus expertise) using non-linear solutions is limited. Please do not flame me; I am simply urging caution.

John

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine, Division of Gerontology and Geriatric Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

On Aug 19, 2023, at 4:35 PM, J C Nash <profjcn...@gmail.com> wrote:

Why bother?
nlsr can find a solution from a very crude start.

Mixture <- c(17, 14, 5, 1, 11, 2, 16, 7, 19, 23, 20, 6, 13, 21, 3, 18, 15, 26, 8, 22)
x1 <- c(69.98, 72.5, 77.6, 79.98, 74.98, 80.06, 69.98, 77.34, 69.99, 67.49, 67.51, 77.63, 72.5, 67.5, 80.1, 69.99, 72.49, 64.99, 75.02, 67.48)
x2 <- c(29, 25.48, 21.38, 19.85, 22, 18.91, 29.99, 19.65, 26.99, 29.49, 32.47, 20.35, 26.48, 31.47, 16.87, 27.99, 24.49, 31.99, 24.96, 30.5)
x3 <- c(1, 2, 1, 0, 3, 1, 0, 2.99, 3, 3, 0, 2, 1, 1, 3, 2, 3, 3, 0, 2)
y <- c(1.4287, 1.4426, 1.4677, 1.4774, 1.4565, 1.4807, 1.4279, 1.4684, 1.4301, 1.4188, 1.4157, 1.4686, 1.4414, 1.4172, 1.4829, 1.4291, 1.4438, 1.4068, 1.4524, 1.4183)
mydata <- data.frame(Mixture, x1, x2, x3, y)
mydata
mymod <- y ~ 1/(Beta1*x1 + Beta2*x2 + Beta3*x3)
library(nlsr)
strt <- c(Beta1=1, Beta2=2, Beta3=3)
trysol <- nlxb(formula=mymod, data=mydata, start=strt, trace=TRUE)
trysol # or pshort(trysol)

Output is

residual sumsquares = 1.5412e-05 on 20 observations
after 29 Jacobian and 43 function evaluations
  name       coeff         SE    tstat       pval    gradient  JSingval
Beta1  0.00629212  5.997e-06     1049  2.425e-42   4.049e-08     721.8
Beta2  0.00867741  1.608e-05    539.7  1.963e-37  -2.715e-08     56.05
Beta3  0.00801948  8.809e-05    91.03  2.664e-24   1.497e-08     10.81

J Nash

On 2023-08-19 16:19, Paul Bernal wrote:

Dear friends,

Hope you are all doing well and having a great weekend. I have data that was collected on specific gravity and spectrophotometer analysis for 26 mixtures of NG (nitroglycerine), TA (triacetin), and 2 NDPA (2-nitrodiphenylamine). In the dataset, x1 = %NG, x2 = %TA, and x3 = %2 NDPA. The response variable is the specific gravity, and the rest of the variables are the predictors.
This is the dataset:

dput(mod14data_random)
structure(list(Mixture = c(17, 14, 5, 1, 11, 2, 16, 7, 19, 23,
20, 6, 13, 21, 3, 18, 15, 26, 8, 22), x1 = c(69.98, 72.5, 77.6,
79.98, 74.98, 80.06, 69.98, 77.34, 69.99, 67.49, 67.51, 77.63,
72.5, 67.5, 80.1, 69.99, 72.49, 64.99, 75.02, 67.48), x2 = c(29,
25.48, 21.38, 19.85, 22, 18.91, 29.99, 19.65, 26.99, 29.49, 32.47,
20.35, 26.48, 31.47, 16.87, 27.99, 24.49, 31.99, 24.96, 30.5),
x3 = c(1, 2, 1, 0, 3, 1, 0, 2.99, 3, 3, 0, 2, 1, 1, 3, 2, 3,
3, 0, 2), y = c(1.4287, 1.4426, 1.4677, 1.4774, 1.4565, 1.4807,
1.4279, 1.4684, 1.4301, 1.4188, 1.4157, 1.4686, 1.4414, 1.4172,
1.4829, 1.4291, 1.4438, 1.4068, 1.4524, 1.4183)), row.names = c(NA,
-20L), class = "data.frame")

The model is the following:

y = 1/(Beta1*x1 + Beta2*x2 + Beta3*x3)

I need to determine starting (initial) values for the model parameters of this nonlinear regression model. Any ideas on how to accomplish this using R?

Cheers,
Paul
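John's linearized fit plugs directly into nls() in base R's stats package as starting values, so nlxb() isn't strictly required for this model. A sketch using the data from Paul's post (the renaming of the lm() coefficients to Beta1..Beta3 is just to match the nonlinear formula):

```r
x1 <- c(69.98, 72.5, 77.6, 79.98, 74.98, 80.06, 69.98, 77.34, 69.99, 67.49,
        67.51, 77.63, 72.5, 67.5, 80.1, 69.99, 72.49, 64.99, 75.02, 67.48)
x2 <- c(29, 25.48, 21.38, 19.85, 22, 18.91, 29.99, 19.65, 26.99, 29.49,
        32.47, 20.35, 26.48, 31.47, 16.87, 27.99, 24.49, 31.99, 24.96, 30.5)
x3 <- c(1, 2, 1, 0, 3, 1, 0, 2.99, 3, 3, 0, 2, 1, 1, 3, 2, 3, 3, 0, 2)
y <- c(1.4287, 1.4426, 1.4677, 1.4774, 1.4565, 1.4807, 1.4279, 1.4684,
       1.4301, 1.4188, 1.4157, 1.4686, 1.4414, 1.4172, 1.4829, 1.4291,
       1.4438, 1.4068, 1.4524, 1.4183)
mydata <- data.frame(x1, x2, x3, y)

# linearized fit: 1/y is linear in the parameters, so lm() supplies starts
start0 <- coef(lm(1/y ~ x1 + x2 + x3 - 1, data = mydata))
names(start0) <- c("Beta1", "Beta2", "Beta3")

# nonlinear fit with the errors entering on the original scale of y
fit <- nls(y ~ 1/(Beta1*x1 + Beta2*x2 + Beta3*x3),
           data = mydata, start = as.list(start0))
coef(fit)  # close to the nlxb() estimates in J Nash's post
```

Because the starting values are already near the least-squares solution, nls() converges in a handful of iterations here.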
Re: [R] Getting an error calling MASS::boxcox in a function
Hi Bert,

On 2023-07-08 3:42 p.m., Bert Gunter wrote:

> Thanks John. ?boxcox says:
>
> "object: a formula or fitted model object. Currently only lm and aov objects are handled."
>
> I read that as saying that boxcox(lm(z+1 ~ 1), ...) should run without error. But it didn't. And perhaps here's why:
>
> BoxCoxLambda <- function(z){
>   b <- MASS:::boxcox.lm(lm(z+1 ~ 1), lambda = seq(-5, 5, length.out = 61), plotit = FALSE)
>   b$x[which.max(b$y)]  # best lambda
> }
> lambdas <- apply(dd, 2, BoxCoxLambda)
> Error in NextMethod() : 'NextMethod' called from an anonymous function
>
> and, indeed, ?UseMethod says: "NextMethod should not be called except in methods called by UseMethod or from internal generics (see InternalGenerics). In particular it will not work inside anonymous calling functions (e.g., get("print.ts")(AirPassengers))."
>
> BUT
>
> BoxCoxLambda <- function(z){
>   b <- MASS::boxcox(z+1 ~ 1, lambda = seq(-5, 5, length.out = 61), plotit = FALSE)
>   b$x[which.max(b$y)]  # best lambda
> }
> lambdas <- apply(dd, 2, BoxCoxLambda)
> lambdas
> [1] 0.167 0.167

As it turns out, it's the update() step in boxcox.lm() that fails, and the update takes place because $y is missing from the lm object, so the following works:

BoxCoxLambda <- function(z){
  b <- boxcox(lm(z + 1 ~ 1, y=TRUE), lambda = seq(-5, 5, length.out = 101), plotit = FALSE)
  b$x[which.max(b$y)]
}

The identical lambdas do not seem right to me; I think that's just an accident of the example (using the BoxCoxLambda() above):

> apply(dd, 2, BoxCoxLambda, simplify = TRUE)
[1] 0.2 0.2
> dd[, 2] <- dd[, 2]^3
> apply(dd, 2, BoxCoxLambda, simplify = TRUE)
[1] 0.2 0.1

Best,
 John

> nor do I understand why boxcox.lm apparently throws the error while boxcox.formula does not (it also calls NextMethod()). So I would welcome clarification to clear my clogged (cerebral) sinuses.
:-)

Best,
Bert

On Sat, Jul 8, 2023 at 11:25 AM John Fox wrote:

Dear Ron and Bert,

First (and without considering why one would want to do this, e.g., adding a start of 1 to the data), the following works for me:

-- snip --

> library(MASS)
> BoxCoxLambda <- function(z){
+   b <- boxcox(z + 1 ~ 1,
+               lambda = seq(-5, 5, length.out = 101),
+               plotit = FALSE)
+   b$x[which.max(b$y)]
+ }
> mrow <- 500
> mcol <- 2
> set.seed(12345)
> dd <- matrix(rgamma(mrow*mcol, shape = 2, scale = 5), nrow = mrow, ncol = mcol)
> dd1 <- dd[, 1] # 1st column of dd
> res <- boxcox(lm(dd1 + 1 ~ 1), lambda = seq(-5, 5, length.out = 101), plotit = FALSE)
> res$x[which.max(res$y)]
[1] 0.2
> apply(dd, 2, BoxCoxLambda, simplify = TRUE)
[1] 0.2 0.2

-- snip --

One could also use the powerTransform() function in the car package, which in this context transforms towards *multi*normality:

-- snip --

> library(car)
Loading required package: carData
> powerTransform(dd + 1)
Estimated transformation parameters
       Y1        Y2
0.1740200 0.2089925

-- snip --

I hope this helps,
 John
--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2023-07-08 12:47 p.m., Bert Gunter wrote:

No, I'm afraid I'm wrong. Something went wrong with my R session and gave me incorrect answers. After restarting, I continued to get the same error as you did with my supposed "fix." So just ignore what I said and sorry for the noise.

-- Bert

On Sat, Jul 8, 2023 at 8:28 AM Bert Gunter wrote:

Try this for your function:

BoxCoxLambda <- function(z){
  y <- z
  b <- boxcox(y + 1 ~ 1, lambda = seq(-5, 5, length.out = 61), plotit = FALSE)
  b$x[which.max(b$y)]  # best lambda
}

***I think*** (corrections and clarification strongly welcomed!) that `~` (the formula function) is looking for 'z' in the GlobalEnv, the caller of apply(), and not finding it.
It finds 'y' here explicitly in the BoxCoxLambda environment. Cheers, Bert On Sat, Jul 8, 2023 at 4:28 AM Ron Crump via R-help wrote: Hi, Firstly, apologies as I have posted this on community.rstudio.com too. I want to optimise a Box-Cox transformation on columns of a matrix (ie, a unique lambda for each column). So I wrote a function that includes the call to MASS::boxcox in order that it can be applied to each column easily. Except that I'm getting an error when calling the function. If I just extract a column of the matrix
Re: [R] Getting an error calling MASS::boxcox in a function
Dear Ron and Bert,

First (and without considering why one would want to do this, e.g., adding a start of 1 to the data), the following works for me:

-- snip --

> library(MASS)
> BoxCoxLambda <- function(z){
+   b <- boxcox(z + 1 ~ 1,
+               lambda = seq(-5, 5, length.out = 101),
+               plotit = FALSE)
+   b$x[which.max(b$y)]
+ }
> mrow <- 500
> mcol <- 2
> set.seed(12345)
> dd <- matrix(rgamma(mrow*mcol, shape = 2, scale = 5), nrow = mrow, ncol = mcol)
> dd1 <- dd[, 1] # 1st column of dd
> res <- boxcox(lm(dd1 + 1 ~ 1), lambda = seq(-5, 5, length.out = 101), plotit = FALSE)
> res$x[which.max(res$y)]
[1] 0.2
> apply(dd, 2, BoxCoxLambda, simplify = TRUE)
[1] 0.2 0.2

-- snip --

One could also use the powerTransform() function in the car package, which in this context transforms towards *multi*normality:

-- snip --

> library(car)
Loading required package: carData
> powerTransform(dd + 1)
Estimated transformation parameters
       Y1        Y2
0.1740200 0.2089925

-- snip --

I hope this helps,
 John
--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2023-07-08 12:47 p.m., Bert Gunter wrote:

No, I'm afraid I'm wrong. Something went wrong with my R session and gave me incorrect answers. After restarting, I continued to get the same error as you did with my supposed "fix." So just ignore what I said and sorry for the noise.

-- Bert

On Sat, Jul 8, 2023 at 8:28 AM Bert Gunter wrote:

Try this for your function:

BoxCoxLambda <- function(z){
  y <- z
  b <- boxcox(y + 1 ~ 1, lambda = seq(-5, 5, length.out = 61), plotit = FALSE)
  b$x[which.max(b$y)]  # best lambda
}

***I think*** (corrections and clarification strongly welcomed!) that `~` (the formula function) is looking for 'z' in the GlobalEnv, the caller of apply(), and not finding it. It finds 'y' here explicitly in the BoxCoxLambda environment.
Cheers,
Bert

On Sat, Jul 8, 2023 at 4:28 AM Ron Crump via R-help wrote:

Hi,

Firstly, apologies as I have posted this on community.rstudio.com too.

I want to optimise a Box-Cox transformation on columns of a matrix (i.e., a unique lambda for each column), so I wrote a function that includes the call to MASS::boxcox in order that it can be applied to each column easily. Except that I'm getting an error when calling the function. If I just extract a column of the matrix and run the code not in the function, it works. If I call the function either with an extracted column (i.e., dd1 in the reprex below) or in a call to apply(), I get an error (see the reprex below). I'm sure I'm doing something silly, but I can't see what it is. Any help appreciated.

library(MASS)

# Find optimised lambda for Box-Cox transformation
BoxCoxLambda <- function(z){
  b <- boxcox(lm(z+1 ~ 1), lambda = seq(-5, 5, length.out = 61), plotit = FALSE)
  b$x[which.max(b$y)]  # best lambda
}

mrow <- 500
mcol <- 2
set.seed(12345)
dd <- matrix(rgamma(mrow*mcol, shape = 2, scale = 5), nrow = mrow, ncol = mcol)

# Try it not using the BoxCoxLambda function:
dd1 <- dd[,1] # 1st column of dd
bb <- boxcox(lm(dd1+1 ~ 1), lambda = seq(-5, 5, length.out = 101), plotit = FALSE)
print(paste0("1st column's lambda is ", bb$x[which.max(bb$y)]))
#> [1] "1st column's lambda is 0.2"

# Calculate lambda for each column of dd
lambdas <- apply(dd, 2, BoxCoxLambda, simplify = TRUE)
#> Error in eval(predvars, data, env): object 'z' not found

Created on 2023-07-08 with reprex v2.0.2

Thanks for your time and help.

Ron
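The scoping problem can also be sidestepped entirely by computing the profile log-likelihood that boxcox() maximizes directly in base R, so nothing has to be re-found by update(). This is my own sketch, not part of the thread: bc_lambda is a made-up name, the formula is the standard normal-theory Box-Cox profile log-likelihood for an intercept-only model, and the "+ 1" start matches what the posters used:

```r
bc_lambda <- function(z, lambdas = seq(-5, 5, length.out = 101)) {
  y <- z + 1  # same start of 1 as in the thread; requires y > 0
  n <- length(y)
  loglik <- vapply(lambdas, function(lam) {
    # Box-Cox transform, with the log limit at lambda = 0
    yt <- if (abs(lam) < 1e-8) log(y) else (y^lam - 1) / lam
    rss <- sum((yt - mean(yt))^2)  # intercept-only model
    -n/2 * log(rss/n) + (lam - 1) * sum(log(y))
  }, numeric(1))
  lambdas[which.max(loglik)]
}

set.seed(12345)
dd <- matrix(rgamma(500 * 2, shape = 2, scale = 5), nrow = 500, ncol = 2)
apply(dd, 2, bc_lambda)  # one lambda per column, no scoping trouble
```

Because the function never builds a model formula, apply() can pass each column in without `~` having to look anything up in the calling environment.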
Re: [R] loess plotting problem
Dear Anupam Tyagi,

On 2023-03-23 11:08 a.m., Anupam Tyagi wrote:
> Thanks, John. However, loess.smooth() is producing a very different curve compared to the one that results from applying predict() on a loess(). I am guessing they are using different defaults. Correct?

No need to guess. Just look at the help pages ?loess and ?loess.smooth. If you don't like the defaults for loess.smooth(), just specify the arguments you want.

Best,
 John

On Thu, 23 Mar 2023 at 20:20, John Fox <j...@mcmaster.ca> wrote:

Dear Anupam Tyagi,

You didn't include your data, so it's not possible to see exactly what happened, but I think that you misunderstand the object that loess() returns. It returns a "loess" object with several components, including the original data in x and y. So if you pass the object to lines(), you'll simply connect the points, and if x isn't sorted, the points won't be in order. Try, e.g.,

plot(speed ~ dist, data=cars)
m <- loess(speed ~ dist, data=cars)
names(m)
lines(m)

You'd do better to use loess.smooth(), which is intended for adding a loess regression to a scatterplot; for example,

plot(speed ~ dist, data=cars)
with(cars, lines(loess.smooth(dist, speed)))

Other points: You don't have to load the stats package, which is available by default when you start R. It's best to avoid attach(), the use of which can cause confusion.

I hope this helps,
 John
--
* preferred email: john.david@proton.me
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2023-03-23 10:18 a.m., Anupam Tyagi wrote:
> For some reason the following code is not plotting as I want it to. I want
> to plot a "loess" line over a scatter plot. I get a jumble, with
> lines connecting all the points. I had a similar problem with "lowess". I
> solved that by dropping "NA" rows from the data columns. Please help.
> library(stats)
> attach(gini_pci_wdi_narm)
> plot(ny_gnp_pcap_pp_kd, si_pov_gini)
> lines(loess(si_pov_gini ~ ny_gnp_pcap_pp_kd, gini_pci_wdi_narm))
> detach(gini_pci_wdi_narm)

--
Anupam.
Re: [R] loess plotting problem
Dear Anupam Tyagi,

You didn't include your data, so it's not possible to see exactly what happened, but I think that you misunderstand the object that loess() returns. It returns a "loess" object with several components, including the original data in x and y. So if you pass the object to lines(), you'll simply connect the points, and if x isn't sorted, the points won't be in order. Try, e.g.,

plot(speed ~ dist, data=cars)
m <- loess(speed ~ dist, data=cars)
names(m)
lines(m)

You'd do better to use loess.smooth(), which is intended for adding a loess regression to a scatterplot; for example,

plot(speed ~ dist, data=cars)
with(cars, lines(loess.smooth(dist, speed)))

Other points: You don't have to load the stats package, which is available by default when you start R. It's best to avoid attach(), the use of which can cause confusion.

I hope this helps,
 John
--
* preferred email: john.david@proton.me
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2023-03-23 10:18 a.m., Anupam Tyagi wrote:

For some reason the following code is not plotting as I want it to. I want to plot a "loess" line over a scatter plot. I get a jumble, with lines connecting all the points. I had a similar problem with "lowess". I solved that by dropping "NA" rows from the data columns. Please help.

library(stats)
attach(gini_pci_wdi_narm)
plot(ny_gnp_pcap_pp_kd, si_pov_gini)
lines(loess(si_pov_gini ~ ny_gnp_pcap_pp_kd, gini_pci_wdi_narm))
detach(gini_pci_wdi_narm)
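The sorting point above can be seen with the cars example: ordering on x before drawing makes lines() on the fitted values behave, even without loess.smooth(). A sketch of that alternative:

```r
m <- loess(speed ~ dist, data = cars)
ord <- order(cars$dist)  # sort on x so the line is drawn left to right

plot(speed ~ dist, data = cars)
lines(cars$dist[ord], fitted(m)[ord])
```

This is essentially what predict()-based plotting recipes do; loess.smooth() just packages the evaluation grid and sorting for you (with its own span and degree defaults, which is why its curve can differ).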
Re: [R] Good Will Legal Question
Dear Timothy,

On 2023-03-21 1:38 p.m., Ebert, Timothy Aaron wrote:
> My guess: It is clear from the link that they can use the R logo for commercial purposes. The issue is what to do about the "appropriate credit" and "link to the license." How would I do that on a hoodie? Would they need a web address or something?

That's a good question, and one that I missed -- the implicit focus is on using the logo, e.g., in software. With the caveat that I'm not speaking for the R Foundation, I think that it would be sufficient to provide credit and a link to the license on the webpage that sells the hoodie.

FWIW, I (and I expect you) have seen many t-shirts, etc., with R logos, some from companies, and I even have a few. I doubt that anyone will care.

Best,
 John

-----Original Message-----
From: R-help On Behalf Of John Fox
Sent: Tuesday, March 21, 2023 1:19 PM
To: Coding Hoodies
Cc: r-help@r-project.org
Subject: Re: [R] Good Will Legal Question

Dear Arid Sweeting,

R-help is probably not the place to ask this question, although perhaps, since you're seeking moral advice, people might want to say something. I would normally expect to see a query like this addressed to the R website webmasters, of which I'm one -- with the caveat that the R Foundation doesn't give legal advice.

Just to be sure, you say that you read the rules for use of the R logo, so I assume that you've seen <https://www.r-project.org/logo/>, which seems entirely clear to me. I think that it's safe to say that if the R Foundation wanted to limit commercial use of the R logo, it wouldn't have released it under the CC-BY-SA 4.0 license.
I'm not sure what moral issues concern you. I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2023-03-21 6:18 a.m., Coding Hoodies wrote: Hi R Team! We are opening a new start-up soon, codinghoodies.com; we want to make coders feel stylish. Out of goodwill I wanted to ask you formally for permission to use the standard R logo on the front of hoodies to sell. I have read your rules but wanted to ask, as I feel a moral obligation to email you to show support and respect for the R project. If it makes it easier, I could send a picture of the hoodie with the logo on it to you to see if this is acceptable.
Arid Sweeting __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] DOUBT
Dear Nandiniraj, Please cc r-help in your emails so that others can see what happened with your problem. You don't provide enough information to know exactly what the source of your problem is -- you're more likely to get effective help if you provide a minimal reproducible example of the problem -- but it's a good guess that the variable (HHsize, or perhaps some other variable) isn't in the newdata data frame. Best, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ On 2023-03-21 1:24 p.m., Nandini raj wrote: I removed the space, even though it is still showing an error, i.e., variable not found. Nandiniraj On Tue, Mar 21, 2023, 10:36 PM John Fox wrote: Dear Nandini raj, You have a space in the variable name "HH size". I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2023-03-20 1:16 p.m., Nandini raj wrote: Respected sir/madam, can you please suggest what is an unexpected symbol in the below code for running a multinomial logistic regression:

model <- multinom(adoption ~ age + education + HH size + landholding +
    Farmincome + nonfarmincome + creditaccesibility + LHI, data=newdata)
__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Good Will Legal Question
Dear Arid Sweeting, R-help is probably not the place to ask this question, although perhaps since you're seeking moral advice, people might want to say something. I would normally expect to see a query like this addressed to the R website webmasters, of which I'm one -- with the caveat that the R Foundation doesn't give legal advice. Just to be sure, you say that you read the rules for use of the R logo, so I assume that you've seen <https://www.r-project.org/logo/>, which seems entirely clear to me. I think that it's safe to say that if the R Foundation wanted to limit commercial use of the R logo, it wouldn't have released it under the CC-BY-SA 4.0 license. I'm not sure what moral issues concern you. I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2023-03-21 6:18 a.m., Coding Hoodies wrote: Hi R Team! We are opening a new start-up soon, codinghoodies.com; we want to make coders feel stylish. Out of goodwill I wanted to ask you formally for permission to use the standard R logo on the front of hoodies to sell. I have read your rules but wanted to ask, as I feel a moral obligation to email you to show support and respect for the R project. If it makes it easier, I could send a picture of the hoodie with the logo on it to you to see if this is acceptable. Arid Sweeting __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] DOUBT
Dear Nandini raj, You have a space in the variable name "HH size". I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2023-03-20 1:16 p.m., Nandini raj wrote: Respected sir/madam, can you please suggest what is an unexpected symbol in the below code for running a multinomial logistic regression:

model <- multinom(adoption ~ age + education + HH size + landholding +
    Farmincome + nonfarmincome + creditaccesibility + LHI, data=newdata)

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
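[Editor's note: for completeness, a column name containing a space can be used in a formula by quoting it with backticks. A minimal sketch with made-up data and hypothetical variable names; glm() is used here so the sketch is self-contained, but the same backtick trick applies in a multinom() formula:]

```r
# Hypothetical data frame whose column name contains a space:
newdata <- data.frame(
  adoption    = c(0, 1, 0, 1, 1, 0, 1, 0),
  `HH size`   = c(2, 5, 4, 3, 6, 1, 7, 2),
  check.names = FALSE  # keep the space in the column name
)

# Backticks make the formula parser treat `HH size` as a single name:
fit <- glm(adoption ~ `HH size`, family = binomial, data = newdata)
length(coef(fit))  # two coefficients: intercept and the `HH size` slope
```

Renaming the column (e.g., to HHsize) is usually the simpler fix, since every later reference then needs no backticks.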
Re: [R] tcl tk: set the position button
Dear Rodrigo, Try tkwm.geometry(win1, "-0+0"), which should position win1 at the top right. I hope this helps, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2023-03-12 8:41 p.m., Rodrigo Badilla wrote: Hi all, I am using the tcltk2 library to show buttons and messages. Everything works fine, but I would like to set the tk2button to the right of my screen; by default it displays at the left of my screen. My script example:

library(tcltk2)
win1 <- tktoplevel()
butOK <- tk2button(win1, text = "TEST", width = 77)
tkgrid(butOK)

Thanks in advance, regards, Rodrigo __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] MFA variables graph, filtered by separate.analyses
Dear gavin, I think that it's likely that Jim meant the hetcor() function in the polycor package. Best, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2023-02-21 5:42 p.m., gavin duley wrote: Hi Jim, On Tue, 21 Feb 2023 at 22:17, Jim Lemon wrote: I can't work through this right now, but I would start by looking at the 'hetcor' package to get the correlations, or if they are already in the return object, build a plot from these. Thanks for the suggestion. I'll read up on the 'hetcor' package. Thanks, gavin, __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] unexpected 'else' in " else"
Dear Jinsong, When you enter these code lines at the R command prompt, the interpreter evaluates an expression as soon as it's syntactically complete, which occurs before it sees the else clause. The interpreter can't read your mind and know that an else clause will be entered on the next line. When the code lines are in a function, the function body is enclosed in braces, and so the interpreter sees the else clause. As I believe was already pointed out, you can similarly use braces at the command prompt to signal incompleteness of an expression, as in

> {if (FALSE) print(1)
+ else print(2)}
[1] 2

I hope this helps, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2022-10-21 8:06 a.m., Jinsong Zhao wrote: Thanks a lot! I knew the first and third ways to correct the error. The second way shows me why the code is correct in the function stats::weighted.residuals. On 2022/10/21 17:36, Andrew Simmons wrote: The error comes from the expression not being wrapped with braces. You could change it to

if (is.matrix(r)) {
    r[w != 0, , drop = FALSE]
} else r[w != 0]

or

{
    if (is.matrix(r))
        r[w != 0, , drop = FALSE]
    else r[w != 0]
}

or

if (is.matrix(r)) r[w != 0, , drop = FALSE] else r[w != 0]

On Fri., Oct. 21, 2022, 05:29 Jinsong Zhao wrote: Hi there, The following code causes an R error:

> w <- 1:5
> r <- 1:5
> if (is.matrix(r))
+ r[w != 0, , drop = FALSE]
> else r[w != 0]
Error: unexpected 'else' in " else"

However, the code

if (is.matrix(r))
    r[w != 0, , drop = FALSE]
else r[w != 0]

is extracted from stats::weighted.residuals. My question is why the code in the function does not cause an error.
Best, Jinsong __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
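[Editor's note: the parsing behaviour John describes can be demonstrated directly with parse(), without relying on interactive prompts:]

```r
# At the command prompt each syntactically complete expression is
# evaluated immediately, so an 'else' on the following line arrives
# too late -- parsing the two lines together fails:
bad <- tryCatch(
  parse(text = "if (FALSE) 1\nelse 2"),
  error = function(e) "unexpected 'else'"
)

# Braces keep the expression syntactically incomplete until the
# closing brace, so the parser waits and sees the else clause:
val <- eval(parse(text = "{\nif (FALSE) 1\nelse 2\n}"))
val  # 2
```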
Re: [R] How to obtain a consistent estimator with a binary response model with endogenous explanatory variables?
Dear John (again), I was surprised that you were unable to find an existing R function that estimates a probit model by IV, and so I tried a Google search for "probit instrumental variables R", which turned up the ivprobit package as the first hit. That package is also mentioned in the Econometrics CRAN task view <https://cran.r-project.org/web/views/Econometrics.html>. The model fit by the ivprobit() function in the ivprobit package is a bit more general than the one in Wikipedia (at least by my quick reading of both) in that it permits more than one endogenous explanatory variable. Best, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2022-09-28 3:47 p.m., John Fox wrote: Dear John, The Wikipedia page to which you refer appears to have all the information you need to write your own straightforward R program for the 2SLS or ML estimator for a probit model. I hope this helps, John On 2022-09-28 8:50 a.m., Sun, John wrote: Dear All, I stumbled on a Wikipedia page describing two-stage least squares with a probit model, implementing a consistent estimator in binary-response regression. How do I implement this method in R? It is related to the instrumental-variables estimator. I looked in the ivreg and plm packages and found nothing I think is related. https://en.wikipedia.org/wiki/Binary_response_model_with_continuous_endogenous_explanatory_variables Best regards, John __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to obtain a consistent estimator with a binary response model with endogenous explanatory variables?
Dear John, The Wikipedia page to which you refer appears to have all the information you need to write your own straightforward R program for the 2SLS or ML estimator for a probit model. I hope this helps, John On 2022-09-28 8:50 a.m., Sun, John wrote: Dear All, I stumbled on a Wikipedia page describing two-stage least squares with a probit model, implementing a consistent estimator in binary-response regression. How do I implement this method in R? It is related to the instrumental-variables estimator. I looked in the ivreg and plm packages and found nothing I think is related. https://en.wikipedia.org/wiki/Binary_response_model_with_continuous_endogenous_explanatory_variables Best regards, John __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/
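[Editor's note: a minimal sketch of the kind of program John describes -- the simple two-stage ("control function") estimator for a probit model with one continuous endogenous regressor. The data are simulated and all names and parameter values are made up; for serious work use a dedicated package such as ivprobit, which also gives correct standard errors:]

```r
set.seed(123)
n <- 500
z <- rnorm(n)                          # instrument
u <- rnorm(n)                          # unobserved confounder
x <- 0.8 * z + 0.5 * u + rnorm(n)      # endogenous regressor
y <- as.integer(x + u + rnorm(n) > 0)  # binary response

# Stage 1: regress the endogenous regressor on the instrument
stage1 <- lm(x ~ z)
vhat <- residuals(stage1)

# Stage 2: probit of y on x plus the first-stage residuals,
# which absorb the endogenous part of x
stage2 <- glm(y ~ x + vhat, family = binomial(link = "probit"))
coef(stage2)
```

Note that the naive probit glm(y ~ x) would be inconsistent here because x is correlated with the error through u; including the first-stage residuals is what restores consistency.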
Re: [R] Correlate
this helps, John Thank you, __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to add count to pie chart legend
Dear Jim and Ana, Why not skip the legend and put the counts in the labels?

with(df, pie(n, paste0(V1, " (", n, ")"), col=c(3, 2), main="Yes and No", radius=1))

Best, John On 2022-08-15 9:43 p.m., Jim Lemon wrote: Hi Ana, A lot of work for a little pie.

df <- read.table(text="V1 n
Yes 8
No 14", header=TRUE, stringsAsFactors=FALSE)
par(mar=c(5,4,4,4))
pie(df$n, df$V1, col=c(3,2), main="Yes and No", xlab="", ylab="", radius=1)
legend(0.75, -0.8, paste(df$V1, df$n), fill=c(3,2), xpd=TRUE)

Jim On Tue, Aug 16, 2022 at 1:59 AM Ana Marija wrote: Hi All, I have df like this:

df
# A tibble: 2 × 4
  V1        n  perc labels
1 Yes       8 0.364 36%
2 No       14 0.636 64%

I am making a pie chart like this:

library(ggplot2)
ggplot(df, aes(x = "", y = perc, fill = V1)) +
  geom_col(color = "black") +
  geom_label(aes(label = labels),
             position = position_stack(vjust = 0.5),
             show.legend = FALSE) +
  guides(fill = guide_legend(title = "Answer")) +
  coord_polar(theta = "y") +
  theme_void()

How would I add in the legend beside Answer "Yes" count 8 (just the number 8) and beside "No" count 14? Thanks, Ana __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/
Re: [R] Odd behavior of a function within apply
Dear Erin, The problem is that the data frame gets coerced to a character matrix, and the only column with "" entries is the 9th (the second one you supplied):

as.matrix(test1.df)
   X1_1_HZP1 X1_1_HBM1_mon X1_1_HBM1_yr
1  "48160"   "December"    "2014"
2  "48198"   "June"        "2018"
3  "80027"   "August"      "2016"
4  "48161"   ""            NA
5  NA        ""            NA
6  "48911"   "August"      "1985"
7  NA        "April"       "2019"
8  "48197"   "February"    "1993"
9  "48021"   ""            NA
10 "11355"   "December"    "1990"

(Here, test1.df only contains the three columns you provided.) A solution is to use sapply:

> sapply(test1.df, count1a)
    X1_1_HZP1 X1_1_HBM1_mon  X1_1_HBM1_yr
            2             3             3

I hope this helps, John On 2022-08-08 1:22 p.m., Erin Hodgess wrote: Hello! I have the following data.frame:

dput(test1.df[1:10, 8:10])
structure(list(X1_1_HZP1 = c(48160L, 48198L, 80027L, 48161L, NA, 48911L, NA, 48197L, 48021L, 11355L),
    X1_1_HBM1_mon = c("December", "June", "August", "", "", "August", "April", "February", "", "December"),
    X1_1_HBM1_yr = c(2014L, 2018L, 2016L, NA, NA, 1985L, 2019L, 1993L, NA, 1990L)),
    row.names = c(NA, 10L), class = "data.frame")

And the following function:

dput(count1a)
function (x)
{
    if (typeof(x) == "integer")
        y <- sum(is.na(x))
    if (typeof(x) == "character")
        y <- sum(x == "")
    return(y)
}

When I use the apply function with count1a, I get the following:

apply(test1.df[1:10, 8:10], 2, count1a)
    X1_1_HZP1 X1_1_HBM1_mon  X1_1_HBM1_yr
           NA             3            NA

However, when I use columns 8 and 10, I get the correct response:

apply(test1.df[1:10, c(8,10)], 2, count1a)
   X1_1_HZP1 X1_1_HBM1_yr
           2            3

I am really baffled. If I use count1a on a single column, it works fine. Any suggestions much appreciated. Thanks, Sincerely, Erin Erin Hodgess, PhD mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
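[Editor's note: the coercion John describes is easy to reproduce with a small made-up data frame -- apply() runs as.matrix() on the data frame first, while sapply() iterates over the columns directly:]

```r
# Toy data frame with one integer and one character column:
df <- data.frame(num = c(1L, NA, 3L),
                 chr = c("a", "", ""),
                 stringsAsFactors = FALSE)

count1a <- function(x) {
  if (typeof(x) == "integer") return(sum(is.na(x)))
  if (typeof(x) == "character") return(sum(x == ""))
  NA_integer_
}

# apply() first coerces df to a character matrix, so the integer
# column is seen as character, and its NA element makes
# sum(x == "") return NA for that column:
apply(df, 2, count1a)

# sapply() maps over the columns of the data frame directly,
# preserving each column's type:
sapply(df, count1a)  # num = 1, chr = 2
```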
Re: [R] Predicted values from glm() when linear predictor is NA.
Dear Jeff, On 2022-07-28 11:12 a.m., Jeff Newmiller wrote: No, in this case I think I needed the "obvious" breakdown. Still digesting, though... I would prefer that, if an arbitrary selection had been made, it be explicit -- the NA should be replaced with zero if the singular.ok argument is TRUE, rather than making that interpretation in predict.glm. That's one way to think about it, but another is that the model matrix X has 10 columns but is of rank 9. Thus 9 basis vectors are needed to span the column space of X, and a simple way to provide a basis is to eliminate a redundant column, hence the NA. The fitted values y-hat in a linear model are the orthogonal projection of y onto the space spanned by the columns of X, and are thus independent of the basis chosen. A GLM is a little more complicated, but it's still the column space of X that's important. Best, John On July 28, 2022 5:45:35 AM PDT, John Fox wrote: Dear Jeff, On 2022-07-28 1:31 a.m., Jeff Newmiller wrote: But "disappearing" is not what NA is supposed to do normally. Why is it being treated that way here? NA has a different meaning here than in data. By default, in glm() the argument singular.ok is TRUE, and so estimates are provided even when there are singularities, even though the singularities are resolved arbitrarily. In this model, the columns of the model matrix labelled LifestageL1 and TrtTime:LifestageL1 are perfectly collinear -- the second is 12 times the first (both have 0s in the same rows and either 1 or 12 in three of the rows) -- and thus both can't be estimated simultaneously, but the model can be estimated by eliminating one or the other (effectively setting its coefficient to 0), or by taking any linear combination of the two regressors (i.e., using any regressor with 0s and some other value). The fitted values under the model are invariant with respect to this arbitrary choice. My apologies if I'm stating the obvious and misunderstand your objection.
Best, John On July 27, 2022 7:04:20 PM PDT, John Fox wrote: Dear Rolf, The coefficient of TrtTime:LifestageL1 isn't estimable (as you explain) and by setting it to NA, glm() effectively removes it from the model. An equivalent model is therefore

fit2 <- glm(cbind(Dead,Alive) ~ TrtTime + Lifestage +
            I((Lifestage == "Egg + L1")*TrtTime) +
            I((Lifestage == "L1 + L2")*TrtTime) +
            I((Lifestage == "L3")*TrtTime),
            family=binomial, data=demoDat)
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred

cbind(coef(fit, complete=FALSE), coef(fit2))
                                  [,1]         [,2]
(Intercept)                -0.91718302  -0.91718302
TrtTime                     0.88846195   0.88846195
LifestageEgg + L1         -45.36420974 -45.36420974
LifestageL1                14.27570572  14.27570572
LifestageL1 + L2           -0.30332697  -0.30332697
LifestageL3                -3.58672631  -3.58672631
TrtTime:LifestageEgg + L1   8.10482459   8.10482459
TrtTime:LifestageL1 + L2    0.05662651   0.05662651
TrtTime:LifestageL3         1.66743472   1.66743472

There is no problem computing fitted values for the model, specified either way. That the fitted values when Lifestage == "L1" all round to 1 on the probability scale is coincidental -- that is, a consequence of the data. I hope this helps, John On 2022-07-27 8:26 p.m., Rolf Turner wrote: I have a data frame with a numeric ("TrtTime") and a categorical ("Lifestage") predictor. Level "L1" of Lifestage occurs only with a single value of TrtTime, explicitly 12, whence it is not possible to estimate a TrtTime "slope" when Lifestage is "L1". Indeed, when I fitted the model

fit <- glm(cbind(Dead,Alive) ~ TrtTime*Lifestage, family=binomial, data=demoDat)

I got:

as.matrix(coef(fit))
                                  [,1]
(Intercept)                -0.91718302
TrtTime                     0.88846195
LifestageEgg + L1         -45.36420974
LifestageL1                14.27570572
LifestageL1 + L2           -0.30332697
LifestageL3                -3.58672631
TrtTime:LifestageEgg + L1   8.10482459
TrtTime:LifestageL1                 NA
TrtTime:LifestageL1 + L2    0.05662651
TrtTime:LifestageL3         1.66743472

That is, TrtTime:LifestageL1 is NA, as expected.
I would have thought that fitted or predicted values corresponding to Lifestage = "L1" would thereby be NA, but this is not the case:

predict(fit)[demoDat$Lifestage=="L1"]
      26       65      131
24.02007 24.02007 24.02007

fitted(fit)[demoDat$Lifestage=="L1"]
 26  65 131
  1   1   1

That is, the predicted values on the scale of the linear predictor are large and positive, rather than being NA. What this amounts to, it see
Re: [R] Predicted values from glm() when linear predictor is NA.
Dear Jeff, On 2022-07-28 1:31 a.m., Jeff Newmiller wrote: But "disappearing" is not what NA is supposed to do normally. Why is it being treated that way here? NA has a different meaning here than in data. By default, in glm() the argument singular.ok is TRUE, and so estimates are provided even when there are singularities, and even though the singularities are resolved arbitrarily. In this model, the columns of the model matrix labelled LifestageL1 and TrtTime:LifestageL1 are perfectly collinear -- the second is 12 times the first (both have 0s in the same rows and either 1 or 12 in three of the rows) -- and thus both can't be estimated simultaneously, but the model can be estimated by eliminating one or the other (effectively setting its coefficient to 0), or by taking any linear combination of the two regressors (i.e., using any regressor with 0s and some other value). The fitted values under the model are invariant with respect to this arbitrary choice. My apologies if I'm stating the obvious and misunderstand your objection. Best, John On July 27, 2022 7:04:20 PM PDT, John Fox wrote: Dear Rolf, The coefficient of TrtTime:LifestageL1 isn't estimable (as you explain) and by setting it to NA, glm() effectively removes it from the model. 
An equivalent model is therefore

fit2 <- glm(cbind(Dead,Alive) ~ TrtTime + Lifestage +
            I((Lifestage == "Egg + L1")*TrtTime) +
            I((Lifestage == "L1 + L2")*TrtTime) +
            I((Lifestage == "L3")*TrtTime),
            family=binomial, data=demoDat)
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred

cbind(coef(fit, complete=FALSE), coef(fit2))
                                  [,1]         [,2]
(Intercept)                -0.91718302  -0.91718302
TrtTime                     0.88846195   0.88846195
LifestageEgg + L1         -45.36420974 -45.36420974
LifestageL1                14.27570572  14.27570572
LifestageL1 + L2           -0.30332697  -0.30332697
LifestageL3                -3.58672631  -3.58672631
TrtTime:LifestageEgg + L1   8.10482459   8.10482459
TrtTime:LifestageL1 + L2    0.05662651   0.05662651
TrtTime:LifestageL3         1.66743472   1.66743472

There is no problem computing fitted values for the model, specified either way. That the fitted values when Lifestage == "L1" all round to 1 on the probability scale is coincidental -- that is, a consequence of the data. I hope this helps, John On 2022-07-27 8:26 p.m., Rolf Turner wrote: I have a data frame with a numeric ("TrtTime") and a categorical ("Lifestage") predictor. Level "L1" of Lifestage occurs only with a single value of TrtTime, explicitly 12, whence it is not possible to estimate a TrtTime "slope" when Lifestage is "L1". Indeed, when I fitted the model

fit <- glm(cbind(Dead,Alive) ~ TrtTime*Lifestage, family=binomial, data=demoDat)

I got:

as.matrix(coef(fit))
                                  [,1]
(Intercept)                -0.91718302
TrtTime                     0.88846195
LifestageEgg + L1         -45.36420974
LifestageL1                14.27570572
LifestageL1 + L2           -0.30332697
LifestageL3                -3.58672631
TrtTime:LifestageEgg + L1   8.10482459
TrtTime:LifestageL1                 NA
TrtTime:LifestageL1 + L2    0.05662651
TrtTime:LifestageL3         1.66743472

That is, TrtTime:LifestageL1 is NA, as expected.
I would have thought that fitted or predicted values corresponding to Lifestage = "L1" would thereby be NA, but this is not the case:

predict(fit)[demoDat$Lifestage=="L1"]
      26       65      131
24.02007 24.02007 24.02007

fitted(fit)[demoDat$Lifestage=="L1"]
 26  65 131
  1   1   1

That is, the predicted values on the scale of the linear predictor are large and positive, rather than being NA. What this amounts to, it seems to me, is saying that if the linear predictor in a Binomial glm is NA, then "success" is a certainty. This strikes me as being a dubious proposition. My gut feeling is that misleading results could be produced. Can anyone explain to me a rationale for this behaviour pattern? Is there some justification for it that I am not currently seeing? Any other comments? (Please omit comments to the effect of "You are as thick as two short planks!". :-) ) I have attached the example data set in a file "demoDat.txt", should anyone want to experiment with it. The file was created using dput() so you should access it (if you wish to do so) via something like demoDat <- dget("demoDat.txt") Thanks for any enlightenment. cheers, Rolf Turner
Re: [R] Predicted values from glm() when linear predictor is NA.
Dear Rolf, The coefficient of TrtTime:LifestageL1 isn't estimable (as you explain) and by setting it to NA, glm() effectively removes it from the model. An equivalent model is therefore > fit2 <- glm(cbind(Dead,Alive) ~ TrtTime + Lifestage + + I((Lifestage == "Egg + L1")*TrtTime) + + I((Lifestage == "L1 + L2")*TrtTime) + + I((Lifestage == "L3")*TrtTime), + family=binomial, data=demoDat) Warning message: glm.fit: fitted probabilities numerically 0 or 1 occurred > cbind(coef(fit, complete=FALSE), coef(fit2)) [,1] [,2] (Intercept) -0.91718302 -0.91718302 TrtTime 0.88846195 0.88846195 LifestageEgg + L1 -45.36420974 -45.36420974 LifestageL1 14.27570572 14.27570572 LifestageL1 + L2 -0.30332697 -0.30332697 LifestageL3 -3.58672631 -3.58672631 TrtTime:LifestageEgg + L1 8.10482459 8.10482459 TrtTime:LifestageL1 + L2 0.05662651 0.05662651 TrtTime:LifestageL3 1.66743472 1.66743472 There is no problem computing fitted values for the model, specified either way. That the fitted values when Lifestage == "L1" all round to 1 on the probability scale is coincidental -- that is, a consequence of the data. I hope this helps, John On 2022-07-27 8:26 p.m., Rolf Turner wrote: I have a data frame with a numeric ("TrtTime") and a categorical ("Lifestage") predictor. Level "L1" of Lifestage occurs only with a single value of TrtTime, explicitly 12, whence it is not possible to estimate a TrtTime "slope" when Lifestage is "L1". Indeed, when I fitted the model fit <- glm(cbind(Dead,Alive) ~ TrtTime*Lifestage, family=binomial, data=demoDat) I got: as.matrix(coef(fit)) [,1] (Intercept) -0.91718302 TrtTime 0.88846195 LifestageEgg + L1 -45.36420974 LifestageL1 14.27570572 LifestageL1 + L2 -0.30332697 LifestageL3 -3.58672631 TrtTime:LifestageEgg + L1 8.10482459 TrtTime:LifestageL1 NA TrtTime:LifestageL1 + L2 0.05662651 TrtTime:LifestageL3 1.66743472 That is, TrtTime:LifestageL1 is NA, as expected. 
I would have thought that fitted or predicted values corresponding to Lifestage = "L1" would thereby be NA, but this is not the case: predict(fit)[demoDat$Lifestage=="L1"] 26 65 131 24.02007 24.02007 24.02007 fitted(fit)[demoDat$Lifestage=="L1"] 26 65 131 1 1 1 That is, the predicted values on the scale of the linear predictor are large and positive, rather than being NA. What this amounts to, it seems to me, is saying that if the linear predictor in a Binomial glm is NA, then "success" is a certainty. This strikes me as being a dubious proposition. My gut feeling is that misleading results could be produced. Can anyone explain to me a rationale for this behaviour pattern? Is there some justification for it that I am not currently seeing? Any other comments? (Please omit comments to the effect of "You are as thick as two short planks!". :-) ) I have attached the example data set in a file "demoDat.txt", should anyone want to experiment with it. The file was created using dput() so you should access it (if you wish to do so) via something like demoDat <- dget("demoDat.txt") Thanks for any enlightenment. cheers, Rolf Turner __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
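The aliasing Rolf describes is easy to reproduce with a small made-up data set (the names x, g, and y below are hypothetical, not from demoDat): when a numeric predictor is constant within one factor level, the corresponding interaction column is a linear combination of the other columns, the fitter sets its coefficient to NA, and predict() simply omits the dropped term.

```r
## Made-up data: x is constant (12) within group "L1", as in Rolf's design,
## so one interaction coefficient is aliased and reported as NA.
set.seed(1)
d <- data.frame(x = c(12, 12, 12, 1:9),
                g = factor(rep(c("L1", "other"), c(3, 9))),
                y = rnorm(12))
m <- lm(y ~ x * g, data = d)
coef(m)                  # one coefficient is NA (aliased), as in the glm fit
predict(m)[d$g == "L1"]  # predictions are still defined: the NA term is dropped
```

This is a sketch with lm() rather than a binomial glm(), but the handling of aliased coefficients is the same.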
Re: [R] grep
Dear Steven, Beyond ?regex, the Wikipedia article on regular expressions <https://en.wikipedia.org/wiki/Regular_expression> is quite helpful and not too long. I hope this helps, John On 2022-07-10 9:43 p.m., Steven T. Yen wrote: Thanks Jeff. It works. If there is a good reference I should read (besides ? grep) I'd be glad to have it. On 7/11/2022 9:30 AM, Jeff Newmiller wrote: grep( "^(z|x)\\.", jj, value = TRUE ) or grep( r"(^(z|x)\.)", jj, value = TRUE ) On July 10, 2022 6:08:45 PM PDT, "Steven T. Yen" wrote: Dear, Below, jj contains character strings starting with “z.” and “x.”. I want to grep all that contain either “z.” or “x.”. I had to grep “z.” and “x.” separately and then tack the result together. Is there a convenient grep option that would grep strings with either “z.” or “x.”. Thank you! jj<-names(v$est); jj [1] "z.one" "z.liberal" "z.conserv" "z.dem" "z.rep" "z.realinc" [7] "x.one" "x.liberal" "x.conserv" "x.dem" "x.rep" "x.realinc" [13] "mu1_1" "mu2_1" "rho" j1<-grep("z.",jj,value=TRUE); j1 [1] "z.one" "z.liberal" "z.conserv" "z.dem" "z.rep" "z.realinc" j2<-grep("x.",jj,value=TRUE); j2 [1] "x.one" "x.liberal" "x.conserv" "x.dem" "x.rep" "x.realinc" j<-c(j1,j2); j [1] "z.one" "z.liberal" "z.conserv" "z.dem" "z.rep" "z.realinc" [7] "x.one" "x.liberal" "x.conserv" "x.dem" "x.rep" "x.realinc" __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
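Jeff's anchored pattern also quietly fixes a bug in the original calls: in a regular expression an unescaped "." matches any character, so "x." can match names that merely contain an "x". A small sketch (the vector below is made up to include a stray match):

```r
jj <- c("z.one", "x.one", "mu1_1", "rho", "xylophone")
## Unescaped dot: "x." means "x followed by any character",
## so this also matches "xylophone":
grep("x.", jj, value = TRUE)
## Anchored with an escaped dot: only names *starting* with "z." or "x.":
grep("^(z|x)\\.", jj, value = TRUE)   # "z.one" "x.one"
```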
Re: [R] R for linear algebra
Dear Avi, On 2022-06-25 2:09 p.m., Avi Gross via R-help wrote: John, I am not in any way disparaging the matlib package and it seems well-built for the limited purpose of teaching Linear Algebra rather than R. It is probably a better answer to a question about how to teach linear algebra while making some more complex tasks doable. I recall the frustration of multiplying matrices by hand as well as other operations, necessarily on smaller matrices. My comments were more along the lines of the charter of this group which seems far narrower. Yes, people can ask for suggestions for a package that does something that may interest them but getting help on any one of thousands of such packages here would get overwhelming. From what you said though, others looking for a package to use for real projects might well beware as it may indeed not be particularly fast or in some cases perhaps not as flexible. As would be clear from the documentation -- e.g., from the Details section of ?Inverse: "The method is purely didactic: The identity matrix, I, is appended to X, giving [X | I]. Applying Gaussian elimination gives [I | X^{-1}], and the portion corresponding to X^{-1} is returned." Best, John -Original Message----- From: John Fox To: Avi Gross Cc: r-help@r-project.org Sent: Sat, Jun 25, 2022 1:34 pm Subject: Re: [R] R for linear algebra Dear Avi, The purpose of the matlib package is to *teach* linear algebra and related topics, not to replace or even compete with similar functionality in base R. 
Consider, e.g., the following example for matlib::Inverse(), which computes matrix inverses by Gaussian elimination (I've elided most of the steps): > example("Inverse") Invers> A <- matrix(c(2, 1, -1, Invers+ -3, -1, 2, Invers+ -2, 1, 2), 3, 3, byrow=TRUE) Invers> Inverse(A) [,1] [,2] [,3] [1,] 4 3 -1 [2,] -2 -2 1 [3,] 5 4 -1 Invers> Inverse(A, verbose=TRUE, fractions=TRUE) Initial matrix: [,1] [,2] [,3] [,4] [,5] [,6] [1,] 2 1 -1 1 0 0 [2,] -3 -1 2 0 1 0 [3,] -2 1 2 0 0 1 row: 1 exchange rows 1 and 2 [,1] [,2] [,3] [,4] [,5] [,6] [1,] -3 -1 2 0 1 0 [2,] 2 1 -1 1 0 0 [3,] -2 1 2 0 0 1 multiply row 1 by -1/3 [,1] [,2] [,3] [,4] [,5] [,6] [1,] 1 1/3 -2/3 0 -1/3 0 [2,] 2 1 -1 1 0 0 [3,] -2 1 2 0 0 1 multiply row 1 by 2 and subtract from row 2 [,1] [,2] [,3] [,4] [,5] [,6] [1,] 1 1/3 -2/3 0 -1/3 0 [2,] 0 1/3 1/3 1 2/3 0 [3,] -2 1 2 0 0 1 . . . multiply row 3 by 2/5 and subtract from row 2 [,1] [,2] [,3] [,4] [,5] [,6] [1,] 1 0 0 4 3 -1 [2,] 0 1 0 -2 -2 1 [3,] 0 0 1 5 4 -1 [,1] [,2] [,3] [1,] 4 3 -1 [2,] -2 -2 1 [3,] 5 4 -1 And similarly for the other functions in the package. Moreover, the functions in the package are transparently programmed in R rather than calling (usually more efficient but relatively inaccessible) compiled code, e.g., in a BLAS. Best, John On 2022-06-24 9:57 p.m., Avi Gross via R-help wrote: Yes, Michael, packages like matlib will extend the basic support within base R and I was amused at looking at what the package supported that I had not thought about in years! https://www.rdocumentation.org/packages/matlib/versions/0.9.5 However, once you throw in packages/modules/libraries and other add-ons, I suggest many languages share such gifts that then allow a serious amount of what we call linear Algebra to be done, albeit with some work. 
In some ways my review of Linear Algebra in recent years showed me that some things have changed from when I took it as part of a math degree in college but a big change has been in finding so many uses of it now that computers can do complicated things fast and even faster when using vectors and matrices as a way to organize and consolidate operations, sometimes even in parallel. Packages can be nice and especially if they gather together lots of related functions to teach a subject but anyone doing serious work should first make sure they know what is in the base. If your matrix is A, you could load a package like psych to get a trace of A: psych::tr(A) Or install matlib with oodles of dependencies matlib::tr(A) Or without worrying if the user had installed and made available a library, use built-ins like diag() and sum(): sum(diag(A)) And what does matlib::Det(A) gain you most of the time that det(A) does not tell you? A policy of this forum is to point out mostly what is part of standard R and obviously there can be specialized functionality in packages, albeit some functions in that package are unlikely to be taught in a first undergraduate Linear Algebra course. I note matlib is billed as also being for learning M
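The didactic method John quotes from ?Inverse -- append I to X, row-reduce [X | I] to [I | X^{-1}], and return the right-hand block -- fits in a few lines of base R. This is a minimal sketch (my own implementation, not the matlib code itself), using the same matrix A as in the example above:

```r
## Gauss-Jordan inverse via row reduction of [X | I], with partial pivoting.
gaussInverse <- function(X) {
  n <- nrow(X)
  A <- cbind(X, diag(n))                    # the augmented matrix [X | I]
  for (j in 1:n) {
    p <- which.max(abs(A[j:n, j])) + j - 1  # pivot row (largest |entry|)
    if (p != j) A[c(j, p), ] <- A[c(p, j), ]
    A[j, ] <- A[j, ] / A[j, j]              # scale pivot row to 1
    for (i in seq_len(n)[-j])               # eliminate column j elsewhere
      A[i, ] <- A[i, ] - A[i, j] * A[j, ]
  }
  A[, (n + 1):(2 * n), drop = FALSE]        # right-hand block is X^{-1}
}

A <- matrix(c(2, 1, -1, -3, -1, 2, -2, 1, 2), 3, 3, byrow = TRUE)
gaussInverse(A)  # same result as Inverse(A) in the thread above
```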
Re: [R] Obtaining the source code
Dear Christofer, > stats:::rstandard.lm function (model, infl = lm.influence(model, do.coef = FALSE), sd = sqrt(deviance(model)/df.residual(model)), type = c("sd.1", "predictive"), ...) { type <- match.arg(type) res <- infl$wt.res/switch(type, sd.1 = c(outer(sqrt(1 - infl$hat), sd)), predictive = 1 - infl$hat) res[is.infinite(res)] <- NaN res } More generally, use ::: for an object that's hidden in a package namespace. I hope this helps, John On 2022-06-19 1:23 p.m., Christofer Bogaso wrote: Hi, I am trying to see the source code of rstandard function. I tried below, methods('rstandard') [1] rstandard.glm* rstandard.lm* What do I need to do if I want to see the source code of rstandard.lm*? Thanks for your help. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
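Besides the ::: operator, base R (via the utils package) provides two lookup functions that retrieve an unexported S3 method without hard-coding its namespace:

```r
## Resolve the method through S3 dispatch, naming generic and class:
getS3method("rstandard", "lm")
## Or search every loaded namespace for the object by name:
getAnywhere("rstandard.lm")
```

getS3method() is convenient when you know the generic and class; getAnywhere() also reports which namespace(s) the object was found in.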
Re: [R] rbind of multiple data frames by column name, when each data frames can contain different columns
-- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Placement of legend in base plot()
Dear Helmut, I'm not sure why you're seeing an error, but replacing the last several commands in your code with legend(loc, legend = lgd, x.intersp = 0, title = paste("n =", n.pts[1]), bg = bg) works perfectly fine for me. I suspect that the example you posted differs in some respect from the code that produces the error. I hope this helps, John On 2022-05-31 9:07 a.m., Helmut Schütz wrote: Dear all, I try to figure out where to automatically place the legend in a scatter plot. If there is large variability, points may cover the legend. Hence, I assess in which section the fewest points are. Example: set.seed(27) # for reproducibility n <- 25 slope <- +1 sd <- 10 x <- 1:n mean.x <- mean(x) y <- slope * x + rnorm(n = n, mean = mean.x, sd = sd) mean.y <- mean(y) top <- which(y >= mean.y) bottom <- which(y < mean.y) left <- which(x <= mean.x) right <- which(x > mean.x) n.pts <- data.frame("topleft" = sum(top %in% left), "topright" = sum(top %in% right), "bottomleft" = sum(bottom %in% left), "bottomright" = sum(bottom %in% right)) loc <- names(n.pts)[n.pts == min(n.pts)] if (length(loc) > 1) loc <- loc[1] # arbitrary selection (better approaches?) 
bg <- "transparent" lgd <- paste("Pretty long legend line number #", 1:3) plot(x, y, type ="n", pch = 19, xlab = "", ylab = "", axes = FALSE, frame.plot = TRUE) abline(h = mean.y, v = mean.x) mtext(text = paste0("top left: n = ", n.pts[1], ", right: n = ", n.pts[2]), side = 3, line = 1) mtext(text = paste0("bottom left: n = ", n.pts[3], ", right: n = ", n.pts[4]), side = 1, line = 1) mtext(text = paste0("bottom: n = ", sum(n.pts[3:4]), ", top = ", sum(n.pts[1:2])), side = 2, line = 1) points(x, y, pch = 19, col = "red", cex = 1.25) print(n.pts); loc if (loc == "topleft") legend("topleft", legend = lgd, x.intersp = 0, title = paste("n =", n.pts[1]), bg = bg) if (loc == "topright") legend("topright", legend = lgd, x.intersp = 0, title = paste("n =", n.pts[2]), bg = bg) if (loc == "bottomleft") legend("bottomleft", legend = lgd, x.intersp = 0, title = paste("n =", n.pts[3]), bg = bg) if (loc == "bottomright") legend("bottomright", legend = lgd, x.intersp = 0, title = paste("n =", n.pts[4]), bg = bg) Unfortunately, one of the keywords in legend() instead of x, y cannot be a variable. Hence, legend(loc, ...) throws an error... Error in match.arg(x, c("bottomright", "bottom", "bottomleft", "left", : 'arg' must be of length 1 ... and I had to resort to conditionally specifying all 4. Problems: 1. If there are the same number of points in sections, I select the first though another might lead to fewer overlapping points. Is there a better approach? 2. I know how to get the width/height of the legend box with (..., plot = FALSE) but couldn't figure out how to squeeze it between points where enough space might exist. 
Best, Helmut -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
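On Helmut's second problem: legend(..., plot = FALSE) returns the legend's bounding box in user coordinates without drawing anything, so candidate positions can be compared against the plotted points. A sketch (it assumes an open plot and the lgd object from the code above):

```r
## Query the legend's geometry without drawing it:
sz <- legend("topleft", legend = lgd, x.intersp = 0, plot = FALSE)
sz$rect$w     # box width in user coordinates
sz$rect$h     # box height in user coordinates
sz$rect$left  # left edge; sz$rect$top gives the top edge
## One could then count points falling inside each candidate rectangle
## and place the legend where that count is smallest.
```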
Re: [R] Complex Survey Data and EFA
Dear Lybrya, I don't have personal experience with it, but to do parallel analyses, you'd have to simulate data according to the sampling design. That shouldn't be too hard but would require custom programming and you may be able to adapt existing code, such as that in the psych package. You already have, in base R and the packages that you reference, what's necessary for computing scree plots and tetrachoric correlations. Here, using an example in ?svyfactanal, is one way to get scree plots based on the correlation matrix with either 1's or communalities on the main diagonal: library(survey) example("factanal") fa <- factanal(~api99+api00+hsg+meals+ell+emer, data=apipop, factors=2) RR <- R <- fa$correlation (u <- fa$uniquenesses) diag(RR) <- 1 - u plot(eigen(R, only.values=TRUE)$values, type="b", ylab=expression(lambda[i]), main="Scree Plot (correlations)") plot(eigen(RR, only.values=TRUE)$values, type="b", ylab=expression(lambda[i]), main="Scree Plot (correlations with communalities)") Here is an example of computing a polychoric correlation based on an example in ?svytable: example("svytable") library(polycor) polychoric(tbl) The example is nonsense in that the levels of stype in the table are out of order -- I show it here just to demonstrate how to do the computation. As well, stype has three levels, but polychoric() computes tetrachoric correlations when both variables are binary. I know that you want a correlation matrix for several binary variables, but it would be simple to compute them in a double for loop. I hope this helps, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2022-04-29 1:31 p.m., Lybrya Kebreab wrote: Hello, Thank you for the help already received in conducting EFAs with complex samples. I have successfully generated the EFA with svyfactanal. 
I have been unsuccessful in using the survey weighted data to generate the extra bells and whistles of EFA such as scree plots and parallel analyses. I noticed there is a way to create scree plots in the ggplots (or ggplots2) package, but am wondering if the svyfactanal function (or another function) in the survey package can generate these plots and subsequent parallel analyses. I also have binary variables. I can generate the EFA with the binary variables using the hetcor function within the polycor package--but without the complex sampling design. Is there a way to conduct the EFA with binary items that also allows me to apply the design weights? Thank you kindly, Lybrya Kebreab Doctoral Candidate Education-Mathematics Education Track School of Teacher Education College of Community Innovation and Education University of Central Florida [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
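The double for loop John mentions can be sketched as follows. This is a hedged illustration, assuming polycor::polychor() accepts a pair of vectors, and "items" is a hypothetical data frame of binary items (it is not part of the thread's data):

```r
library(polycor)

## Pairwise tetrachoric correlation matrix for a data frame of binary items.
tetraMatrix <- function(items) {
  p <- ncol(items)
  R <- diag(p)  # start with 1's on the diagonal
  for (i in 1:(p - 1)) {
    for (j in (i + 1):p) {
      R[i, j] <- R[j, i] <- polychor(items[[i]], items[[j]])
    }
  }
  dimnames(R) <- list(names(items), names(items))
  R
}
```

Note that this ignores the survey weights; combining the pairwise correlations with the design, as discussed above, would still require custom programming.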
Re: [R] Package nlme
Dear Rohan, Bert Gunter has already made several general useful suggestions. In addition, why did you make the variable on the left-hand side of the model a factor? Shouldn't it be a numeric variable? I hope this helps, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 4/7/2022 6:30 AM, Rohan Richard via R-help wrote: Dear Help Desk, I am trying to perform a non-linear regression (Sigmoid curves) using the R package nlme. My field trial is a randomised complete block design (RCBD) with 3 blocks and I would like to assess the block effect in the model. Do you know how I can incorporate the block term in nlme function? So far I tried the following code and it did not work and I got an error message: # code minitab$NDVI<-as.factor(minitab$NDVI) modnlme1 <- nlme(NDVI ~ a + d / (1 + exp(-b * (DegreeDay - m)) ), data = minitab, random =a + d + b + m~ 1|Block, fixed = list(a ~ Lines1, d~Lines1,b ~ Lines1, m ~ Lines1), weights = varPower(), start=c(b=0.5,c=3,d=0.4, e=700), control = list(msMaxIter = 200)) #Error message Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels Could you please kindly help me? Rothamsted Research is a company limited by guarantee, registered in England at Harpenden, Hertfordshire, AL5 2JQ under the registration number 2393175 and a not for profit charity number 802038. Thank you in advance, Best wishes, Rohan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
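Two concrete problems with the posted call, beyond John's point about the factor: the start values are named b, c, d, and e, while the fixed parameters are a, d, b, and m; and NDVI should stay numeric. A hedged, hypothetical sketch (simplified by dropping the Lines1 covariate, with placeholder start values -- untested on these data):

```r
library(nlme)

## Keep the response numeric (undoing the as.factor() call):
minitab$NDVI <- as.numeric(as.character(minitab$NDVI))

## Start values must be named for the fixed parameters a, d, b, m:
fitSketch <- nlme(NDVI ~ a + d / (1 + exp(-b * (DegreeDay - m))),
                  data    = minitab,
                  fixed   = a + d + b + m ~ 1,
                  random  = a + d + b + m ~ 1 | Block,  # block as random effect
                  start   = c(a = 0.1, d = 0.4, b = 0.5, m = 700),
                  control = list(msMaxIter = 200))
```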
Re: [R] print only a few lines of an object
Dear Jeff, On 2022-03-23 3:36 p.m., Jeff Newmiller wrote: After-thought... Why not just use head() and tail() like normal R users do? head() and tail() are reasonable choices if there are many rows, but not if there are many columns. My first thought was your previous suggestion to redefine print() methods (although I agree with you that this isn't a good idea), but though I could get that to work for data frames, I couldn't for matrices. Adapting my preceding examples using car::brief(): > X <- matrix(rnorm(2*200), nrow = 2) > library("car") Loading required package: carData > print.data.frame <- function(x, ...){ # not recommended! + brief(x, ...) + invisible(x) + } > as.data.frame(X) 2 x 200 data.frame (19995 rows and 195 columns omitted) V1 V2 V3 . . .V199 V200 [n][n][n] [n][n] 1 -1.1810658 -0.6090037 1.00579081.23860428 0.6265465 2 -1.6395909 -0.2828005 -0.64181501.12875894 -0.7594760 3 0.2751099 0.2268473 0.22677130.64305445 1.1951732 . . . 1 1.2744054 1.0170934 -1.0172511 -0.02997537 0.7645707 2 -0.4798590 -1.8248293 -1.4664622 -0.06359483 0.7671203 > print.matrix <- function(x, ...){ # not recommended (and doesn't work)! + brief(x, ...) + invisible(x) + } > X [,1] [,2] [,3] [,4] [,5] [1,] -1.181066e+00 -6.090037e-01 1.005791e+00 3.738742e+00 -6.986169e-01 [2,] -1.639591e+00 -2.828005e-01 -6.418150e-01 -7.424275e-01 -1.415092e-01 [3,] 2.751099e-01 2.268473e-01 2.267713e-01 -6.308073e-01 7.042624e-01 [4,] -9.210181e-01 -4.617637e-01 1.523291e+00 4.003071e-01 -2.792705e-01 [5,] -6.047414e-01 1.976075e-01 6.065795e-01 -8.074581e-01 -4.089352e-01 . . . 
[many lines elided] [,196][,197][,198][,199] [,200] [1,] -1.453015e+00 1.347678e+00 1.189217e+00 1.238604e+00 0.6265465033 [2,] -1.693822e+00 2.689917e-01 -1.703176e-01 1.128759e+00 -0.7594760299 [3,] 1.260585e-01 6.589839e-01 -7.928987e-01 6.430545e-01 1.1951731814 [4,] -1.890582e+00 7.614779e-01 -5.726204e-01 1.090881e+00 0.9570510645 [5,] -8.667687e-01 5.365750e-01 -2.079445e+00 1.209543e+00 -0.2697400234 [ reached getOption("max.print") -- omitted 19995 rows ] So, something more complicated that I don't understand is going on with matrices. Best, John On March 23, 2022 12:31:46 PM PDT, Jeff Newmiller wrote: Sure. Re-define the print method for those objects. Can't say I recommend this, but it can be done. On March 23, 2022 11:44:01 AM PDT, Naresh Gurbuxani wrote: In an R session, when I type the name of an object, R prints the entire object (for example, a 2 x 5 data.frame). Is it possible to change the default behavior so that only the first five and last five rows are printed? Similarly, if the object is a 2 x 200 matrix, the default behavior will be to print first five and last five columns, combined with first five and last five rows. Thanks, Naresh __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
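Short of redefining print methods, a small helper function gives the behaviour Naresh asked for -- only the first and last few rows *and* columns. This is a sketch (the function name "corners" is my own, not from any package):

```r
## Show only the first/last n rows and first/last n columns of a
## matrix or data frame.
corners <- function(x, n = 5) {
  nr <- min(n, nrow(x))
  nc <- min(n, ncol(x))
  rows <- unique(c(seq_len(nr), nrow(x) - rev(seq_len(nr)) + 1))
  cols <- unique(c(seq_len(nc), ncol(x) - rev(seq_len(nc)) + 1))
  x[rows, cols, drop = FALSE]
}

X <- matrix(rnorm(2 * 200), nrow = 2)
corners(X)  # all 2 rows, but only the first 5 and last 5 of 200 columns
```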
Re: [R] Cook's distance for least absolute deviation (lad) regressions
Dear Kelly and Jim, On 2022-03-20 9:40 p.m., Jim Lemon wrote: Hi Kelly, Perhaps the best place to look is the "car" package. There is a somewhat confusing reference in the "cookd" function help page to the "cooks.distance" function in the "base" package that doesn't seem to be there. Whether this is the case or not, I think you can still use the "cookd" alias. cookd() in the car package has been defunct for some time. To address the original question: One can compute Cook's distances for *any* regression model by brute-force, omitting each case i in turn and computing the Wald F or chisquare test statistic for the "hypothesis" that the deleted estimate of the regression coefficients b_{-i} is equal to the estimate b for all of the data. In a linear model, D can be computed much more efficiently based on the hatvalues, etc., without having to refit the model n times, but that's not generally the case, unless the model can be linearized (as for a GLM fit by IWLS). I'm insufficiently familiar with the computational details of LAD regression (or quantile regression more generally) to know whether a more efficient computation is possible there, but unless the data set is very large, in which case it's highly unlikely that influence of individual cases is an issue, the brute-force approach should be feasible and very easy to program. I hope this helps, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ Jim On Mon, Mar 21, 2022 at 11:57 AM Kelly Thompson wrote: I'm wanting to calculate Cook's distance for least absolute deviation (lad) regressions. Which R packages and functions offer this? Thanks! __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
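The brute-force approach John describes can be sketched in a few lines. This assumes the fitter provides coef(), vcov(), and update() methods (lm() and glm() do; for an LAD fit via quantreg::rq(), a coefficient covariance matrix may have to be extracted from summary() output instead):

```r
## Generalized Cook's distance by case deletion: refit without each case
## and measure the Wald-type distance between coefficient vectors,
## scaled by the full-data covariance matrix.
cooksBrute <- function(model, data) {
  b <- coef(model)
  Vinv <- solve(vcov(model))  # inverse covariance from the full-data fit
  D <- numeric(nrow(data))
  for (i in seq_len(nrow(data))) {
    d <- coef(update(model, data = data[-i, ])) - b
    D[i] <- as.vector(t(d) %*% Vinv %*% d) / length(b)
  }
  D
}

## Sanity check: for a linear model this reproduces cooks.distance().
m <- lm(mpg ~ wt + hp, data = mtcars)
all.equal(unname(cooks.distance(m)), cooksBrute(m, mtcars))
```

As John notes, refitting n times is only practical for data sets small enough that single-case influence is worth examining.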
Re: [R] Problem with data distribution
Dear Neha gupta, In the last point, I meant to say, "Finally, it's better to post to the list in plain-text email, rather than html (as the posting guide suggests)." (I accidentally inserted a "not" in this sentence.) Sorry, John On 2022-02-17 2:21 p.m., John Fox wrote: Dear Neha gupta, On 2022-02-17 1:54 p.m., Neha gupta wrote: Hello everyone I have a dataset with output variable "bug" having the following values (at the bottom of this email). My advisor asked me to provide data distribution of bugs with 0 values and bugs with more than 0 values. data = readARFF("synapse.arff") data2 = readARFF("synapse.arff") data$bug library(tidyverse) data %>% filter(bug == 0) data2 %>% filter(bug >= 1) boxplot(data2$bug, data$bug, range=0) But both the graphs are exactly the same, how is it possible? Where I am doing wrong? As it turns out, you're doing several things wrong. First, you're not using pipes and filter() correctly. That is, you don't do anything with the filtered versions of the data sets. You're apparently under the incorrect impression that filtering modifies the original data set. Second, you're greatly complicating a simple problem. You don't need to read the data twice and keep two versions of the data set. As well, processing the data with pipes and filter() is entirely unnecessary. The following code works: with(data, boxplot(bug[bug == 0], bug[bug >= 1], range=0)) Third, and most fundamentally, the parallel boxplots you're apparently trying to construct don't really make sense. The first "boxplot" is just a horizontal line at 0 and so conveys no information. Why not just plot the nonzero values if that's what you're interested in? Fourth, you didn't share your data in a convenient form. 
I was able to reconstruct them via

bug <- scan()
0 1 0 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 0 0 0 0 1 0 0 0 0 0 0 4 1 0 0 1 0 0 0 0 0 0 1 0 3 2 0 0 0 0 3 0 0 0 0 2 0 0 0 1 0 0 0 0 1 1 1 0 0 0 0 0 0 1 1 2 1 0 1 0 0 0 2 2 1 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 5 0 0 0 0 0 0 7 0 0 1 0 1 1 0 2 0 3 0 1 0 0 1 0 0 0 0 0 1 1 0 0 0 0 1 0 3 2 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 3 0 0 1 0 1 3 0 0 0 0 0 0 0 0 1 0 4 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 1 0 0 0 0 0
data <- data.frame(bug)

Finally, it's better not to post to the list in plain-text email, rather than html (as the posting guide suggests).

I hope this helps,
John

data$bug
  [1] 0 1 0 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 0 0 0 0 1 0 0 0 0 0 0 4 1 0
 [40] 0 1 0 0 0 0 0 0 1 0 3 2 0 0 0 0 3 0 0 0 0 2 0 0 0 1 0 0 0 0 1 1 1 0 0 0 0 0 0
 [79] 1 1 2 1 0 1 0 0 0 2 2 1 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 5 0 0 0 0 0 0 7 0 0 1
[118] 0 1 1 0 2 0 3 0 1 0 0 1 0 0 0 0 0 1 1 0 0 0 0 1 0 3 2 1 1 0 0 0 0 0 0 0 1 0 0
[157] 0 0 0 0 0 0 0 0 0 1 0 1 0 0 3 0 0 1 0 1 3 0 0 0 0 0 0 0 0 1 0 4 1 1 0 0 0 0 1
[196] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 1 0 0 0 0 0

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/
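John's first point, that filtering returns a new object rather than modifying the original, can be checked with a small base-R sketch. The toy bug vector below is invented for illustration and is not the poster's data:

```r
# Toy data standing in for the poster's "bug" counts (not the real data)
bug <- c(0, 1, 0, 2, 0, 0, 3, 1, 0, 5)
data <- data.frame(bug)

# Subsetting (like dplyr::filter) returns a NEW data frame;
# the original is untouched unless you assign the result.
zeros    <- data[data$bug == 0, , drop = FALSE]
nonzeros <- data[data$bug >= 1, , drop = FALSE]

nrow(data)      # still 10 rows -- filtering did not change it
nrow(zeros)     # 5
nrow(nonzeros)  # 5

# John's one-line replacement for the two-copies-of-the-data approach:
# with(data, boxplot(bug[bug == 0], bug[bug >= 1], range = 0))
```

The same subsetting drives John's with(data, boxplot(...)) suggestion: it picks out the two groups on the fly without keeping two copies of the data set.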
Re: [R] Problem with data distribution
Dear Neha gupta,

I hope that I'm not overstepping my role when I say that googling solutions to specific problems is an inefficient way to learn a programming language, and will probably waste your time in the long run. There are many good introductions to R.

Best,
John

On 2022-02-17 2:27 p.m., Neha gupta wrote:

Dear John, thanks a lot for the detailed answer. Yes, I am not an expert in the R language, and when a problem comes up, I google it or post it on these forums. (I have just a little bit of experience with ML in R.)

On Thu, Feb 17, 2022 at 8:21 PM John Fox <j...@mcmaster.ca> wrote:
--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/
Re: [R] Constructing confidence interval ellipses with R
Dear Paul,

This looks like a version of the question you asked a couple of weeks ago. As I explained then, I'm pretty sure that you want concentration (i.e., data) ellipses and not confidence ellipses, which pertain to parameters (e.g., regression coefficients). Also, the hand-drawn concentration contours in your example graph don't look elliptical, so I'm not sure that you really want ellipses, but I'll assume that you do.

Since, as far as I can see, you didn't share your data, here's a similar example using the scatterplot() function in the car package:

library("car")
scatterplot(prestige ~ income | type, data=Prestige,
            ellipse=TRUE, smooth=FALSE, regLine=FALSE)

By default, this draws 50% and 95% concentration ellipses assuming bivariate normality in each group, but that and other aspects of the graph can be customized; see ?scatterplot.

I hope this helps,
John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/

On 2022-01-24 4:24 p.m., Paul Bernal wrote:

Dear friends,

I will be sharing a dataset which has the following columns:

1. Scenario
2. Day of Transit date
3. Canal Ampliado
4. Canal Original

Basically, I need to create a scatter plot diagram, with the Canal Ampliado column values on the x-axis and the Canal Original column values on the y-axis, but also I need to create confidence interval ellipses grouping the points on the scatterplot, based on the different scenarios. So I need to have in one graph the scatterplot of Canal Ampliado vs Canal Original and then, on the same graph, construct the confidence interval ellipses. I will attach an image depicting what I need to accomplish, as well as the dataset, for your reference. Any help and/or guidance will be greatly appreciated.
Cheers,
Paul
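For readers who want to see what a concentration (data) ellipse is computationally, here is a base-R sketch of the idea: points on the contour all lie at the same Mahalanobis distance from the mean, with the radius set by a chi-squared quantile. This is not the car package's implementation; the function name concentration_ellipse and the simulated data are invented for the illustration:

```r
# Compute points on a 95% bivariate-normal concentration ellipse in base R.
# Sketches the idea behind car::dataEllipse(); not car's actual code.
concentration_ellipse <- function(x, y, level = 0.95, n = 100) {
  center <- c(mean(x), mean(y))
  S <- cov(cbind(x, y))
  # Mahalanobis radius giving the requested coverage (2 df for bivariate data)
  r <- sqrt(qchisq(level, df = 2))
  theta <- seq(0, 2 * pi, length.out = n)
  unit_circle <- cbind(cos(theta), sin(theta))
  # Map the unit circle through a square root of S, then shift to the mean
  ellipse <- r * unit_circle %*% chol(S)
  sweep(ellipse, 2, center, "+")
}

set.seed(1)
x <- rnorm(200)
y <- 0.5 * x + rnorm(200, sd = 0.5)
ell <- concentration_ellipse(x, y)
# plot(x, y); lines(ell)   # overlay the ellipse on the scatterplot
```

Every row of the returned matrix satisfies (p - center)' S^{-1} (p - center) = qchisq(level, 2), which is exactly the defining property of the concentration contour under bivariate normality.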
Re: [R] How to create density ellipses with R
Dear Chris,

I took a quick look at your document. You might be interested in Friendly, Monette, and Fox, "Elliptical Insights: Understanding Statistical Methods through Elliptical Geometry," Statistical Science 2013, 28: 1–39, which is available at <https://arxiv.org/pdf/1302.4881.pdf>. I probably should cite that paper in ?car::ellipse.

Best,
John

On 2022-01-15 12:00 p.m., Chris Evans wrote:

This spurred me on to clarify my own understanding of these ellipses (and of ellipsoid hulls) and led to: https://www.psyctc.org/Rblog/posts/2022-01-15-data-ellipses-and-confidence-ellipses/

And I am, decidedly nervously, putting it here in case it's useful to you, Paul, or to anyone else. I think I have the basic ideas correct, but of course, if any proper statisticians have corrections, I would love to receive them (off list, probably, unless the errors are terrible). I am entirely self-taught as a statistician; much of what I've learned has come from probably over 10, perhaps nearer 20, years on this list. Thanks to all for all the work to maintain the list and contribute to it.

Chris

----- Original Message -----
From: "John Fox"
To: "Paul Bernal"
Cc: "R"
Sent: Friday, 14 January, 2022 18:53:55
Subject: Re: [R] How to create density ellipses with R

Dear Paul,

On 2022-01-14 1:17 p.m., Paul Bernal wrote:

Dear John and R community friends, to be a little bit more specific, what I need to accomplish is the creation of a confidence interval ellipse over a scatterplot at different percentiles. The confidence interval ellipses should be drawn over the scatterplot.

I'm not sure what you mean. Confidence ellipses are for regression coefficients and so are on the scale of the coefficients; data (concentration) ellipses are for and on the scale of the explanatory variables. As it turns out, for a linear model, the former is the rescaled 90-degree rotation of the latter.
Because the scatterplot of the (two) variables has the variables on the axes, a data ellipse but not a confidence ellipse makes sense (i.e., is in the proper units). Data ellipses are drawn by car::dataEllipse() and (as explained by Martin Maechler) cluster::ellipsoidPoints(); confidence ellipses are drawn by car::confidenceEllipse() and the various methods of ellipse::ellipse().

I hope this helps,
John

Any other guidance will be greatly appreciated.

Cheers,
Paul

On Fri, 14 Jan 2022 at 11:27, John Fox <j...@mcmaster.ca> wrote:

Dear Paul,

As I understand it, the ellipse package is meant for drawing confidence ellipses, not density (i.e., data) ellipses. You should be able to use ellipse::ellipse() to draw a bivariate-normal density ellipse (assuming that's what you want), but you'll have to do some computation first. You might find the dataEllipse() function in the car package more convenient (again assuming that you want bivariate-normal density contours).

I hope this helps,
John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/

On 2022-01-14 10:12 a.m., Paul Bernal wrote:

> Dear R friends,
>
> Happy new year to you all. Not quite sure if this is the proper place to ask about this, so I apologize if it is not; and if it isn't, maybe you can point me to the right place.
>
> I would like to know if there is any R package that allows me to produce density ellipses. Searching through the net, I came across a package called ellipse, but I'm not sure if this is the one I should use.
>
> Any help and/or guidance will be greatly appreciated.
> > Best regards,
> >
> > Paul
Re: [R] How to create density ellipses with R
Dear Paul,

On 2022-01-14 1:17 p.m., Paul Bernal wrote:

Dear John and R community friends, to be a little bit more specific, what I need to accomplish is the creation of a confidence interval ellipse over a scatterplot at different percentiles. The confidence interval ellipses should be drawn over the scatterplot.

I'm not sure what you mean. Confidence ellipses are for regression coefficients and so are on the scale of the coefficients; data (concentration) ellipses are for and on the scale of the explanatory variables. As it turns out, for a linear model, the former is the rescaled 90-degree rotation of the latter. Because the scatterplot of the (two) variables has the variables on the axes, a data ellipse but not a confidence ellipse makes sense (i.e., is in the proper units).

Data ellipses are drawn by car::dataEllipse() and (as explained by Martin Maechler) cluster::ellipsoidPoints(); confidence ellipses are drawn by car::confidenceEllipse() and the various methods of ellipse::ellipse().

I hope this helps,
John

Any other guidance will be greatly appreciated.

Cheers,
Paul

On Fri, 14 Jan 2022 at 11:27, John Fox <j...@mcmaster.ca> wrote:

Dear Paul,

As I understand it, the ellipse package is meant for drawing confidence ellipses, not density (i.e., data) ellipses. You should be able to use ellipse::ellipse() to draw a bivariate-normal density ellipse (assuming that's what you want), but you'll have to do some computation first. You might find the dataEllipse() function in the car package more convenient (again assuming that you want bivariate-normal density contours).

I hope this helps,
John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/

On 2022-01-14 10:12 a.m., Paul Bernal wrote:

> Dear R friends,
>
> Happy new year to you all. Not quite sure if this is the proper place to ask about this, so I apologize if it is not; and if it isn't, maybe you can point me to the right place.
>
> I would like to know if there is any R package that allows me to produce density ellipses. Searching through the net, I came across a package called ellipse, but I'm not sure if this is the one I should use.
>
> Any help and/or guidance will be greatly appreciated.
>
> Best regards,
>
> Paul
Re: [R] How to create density ellipses with R
Dear Paul,

As I understand it, the ellipse package is meant for drawing confidence ellipses, not density (i.e., data) ellipses. You should be able to use ellipse::ellipse() to draw a bivariate-normal density ellipse (assuming that's what you want), but you'll have to do some computation first. You might find the dataEllipse() function in the car package more convenient (again assuming that you want bivariate-normal density contours).

I hope this helps,
John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/

On 2022-01-14 10:12 a.m., Paul Bernal wrote:

Dear R friends,

Happy new year to you all. Not quite sure if this is the proper place to ask about this, so I apologize if it is not; and if it isn't, maybe you can point me to the right place.

I would like to know if there is any R package that allows me to produce density ellipses. Searching through the net, I came across a package called ellipse, but I'm not sure if this is the one I should use.

Any help and/or guidance will be greatly appreciated.

Best regards,

Paul
Re: [R] Time for a companion mailing list for R packages?
Dear Avi et al.,

Rather than proliferating R mailing lists, why not just allow questions on non-standard packages on the r-help list?

(1) If people don't want to answer these questions, they don't have to.

(2) Users won't necessarily find the new email list and so may post to r-help anyway, only to be told that they should have posted to another list.

(3) Many of the questions currently posted to the list concern non-standard packages, and most of them are answered.

(4) If people prefer other sources of help (as listed on the R website "getting help" page), then they are free to use them.

(5) As I read the posting guide, questions about non-standard packages aren't actually disallowed; the posting guide suggests, however, that the package maintainer be contacted first. But answers can be helpful to other users, and so it may be preferable for at least some of these questions to be asked on the list.

(6) Finally, the instruction concerning non-standard packages is buried near the end of the posting guide, and users, especially new users, may not understand what the term "standard packages" means even if they find their way to the posting guide.

Best,
John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/

On 2022-01-12 10:27 p.m., Avi Gross via R-help wrote:

Respectfully, this forum gets lots of questions that include non-base R components, especially packages in the tidyverse. Like it or not, the extended R language is far more useful and interesting for many people, especially those who do not wish to constantly reinvent the wheel. And repeatedly we get people reminding (and sometimes chiding) others for daring to post questions or supply answers on what they see as a pure R list. They have a point.
Yes, there are other places (many not being mailing lists like this one) where we can direct the questions, but why can't there be an official mailing list alongside this one specifically focused on helping with, or just discussing, R issues related partially to the use of packages? I don't mean for people making a package to share, just users who may be searching for an appropriate package or using a common package, especially the ones in the tidyverse that are NOT GOING AWAY just because some purists ... I prefer a diverse set of ways to do things, and base R is NOT enough for me; nor, frankly, is R with all packages included, as I find other languages suit my needs at times for doing various things. If this group is for purists, fine. Can we have another for the rest of us? Live and let live.

-----Original Message-----
From: Duncan Murdoch
To: Kai Yang ; R-help Mailing List
Sent: Wed, Jan 12, 2022 3:22 pm
Subject: Re: [R] how to find the table in R studio

On 12/01/2022 3:07 p.m., Kai Yang via R-help wrote:

Hi all, I created a function in R. It will generate a table "temp". I can view it in RStudio, but I cannot find it in the top-right window in RStudio. Can someone tell me how to find it there? Same thing for f_table.

Thank you, Kai

library(tidyverse)
f1 <- function(indata, subgrp1){
  subgrp1 <- enquo(subgrp1)
  indata0 <- indata
  temp <- indata0 %>%
    select(!!subgrp1) %>%
    arrange(!!subgrp1) %>%
    group_by(!!subgrp1) %>%
    mutate(numbering = row_number(), max = max(numbering))
  view(temp)
  f_table <- table(temp$Species)
  view(f_table)
}
f1(iris, Species)

Someone is sure to point out that this isn't an RStudio support list, but your issue is with R, not with RStudio. You created the table in f1, but you never returned it. The variable f_table is local to the function.
You'd need the following code to do what you want:

f1 <- function(indata, subgrp1){
  subgrp1 <- enquo(subgrp1)
  indata0 <- indata
  temp <- indata0 %>%
    select(!!subgrp1) %>%
    arrange(!!subgrp1) %>%
    group_by(!!subgrp1) %>%
    mutate(numbering = row_number(), max = max(numbering))
  view(temp)
  f_table <- table(temp$Species)
  view(f_table)
  f_table
}

f_table <- f1(iris, Species)

It's not so easy to also make temp available. You can do it with assign(), but I think you'd be better off splitting f1 into two functions: one to create temp, and one to create f_table.

Duncan Murdoch
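Duncan's point, that a local variable disappears unless the function returns it as its value, can be illustrated without the tidyverse. The function make_table below is a made-up base-R analogue of Kai's f1, not his code:

```r
# A function's last expression (or an explicit return()) is its value;
# local variables such as 'counts' vanish when the function exits.
make_table <- function(data, column) {
  counts <- table(data[[column]])  # local to the function
  counts                           # returned to the caller
}

species_table <- make_table(iris, "Species")
species_table["setosa"]   # 50
exists("counts")          # FALSE: the local variable is gone
```

Assigning the result (species_table <- make_table(...)) is what makes it appear in the global environment, and hence in RStudio's Environment pane.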
Re: [R] [EXTERNAL] Re: bug in Windows implementation of nlme::groupedData
Dear Melissa,

Normally, in evaluating a formula, an R modeling function follows the scoping rules in ?formula; that is:

"A formula object has an associated environment, and this environment (rather than the parent environment) is used by model.frame to evaluate variables that are not found in the supplied data argument.

"Formulas created with the ~ operator use the environment in which they were created. Formulas created with as.formula will use the env argument for their environment."

So, for example, if the variables in the formula live in the environment of the calling function and if the data argument isn't used, then the variables should be found. Modifying your example a bit and calling lme() works, for example:

--- snip ---

> analyze_this <- function(df) {
+   mean.x <- mean(df$age)
+   mean.y <- mean(df$height)
+   sd.x <- sd(df$age)
+   sd.y <- sd(df$height)
+   x <- (df$age - mean.x) / sd.x
+   y <- (df$height - mean.y) / sd.y
+   X <- model.matrix(~ x * male * black, data = df)
+   dummyID <- rep(1:2, times=c(floor(nrow(X)/2), ceiling(nrow(X)/2)))
+   lme(y ~ X[, -1], random= ~ 1 | dummyID)
+   # groupedData(y ~ X[, -1] | dummyID)
+ }
> analyze_this(growthIndiana)
Linear mixed-effects model fit by REML
  Data: NULL
  Log-restricted-likelihood: -2546.266
Fixed: y ~ X[, -1]
        (Intercept)            X[, -1]x         X[, -1]male        X[, -1]black
        -0.16086555          0.73401464          0.26303561         -0.04761425
      X[, -1]x:male      X[, -1]x:black   X[, -1]male:black X[, -1]x:male:black
         0.27517924         -0.10318100          0.21899350          0.03048160

Random effects:
 Formula: ~1 | dummyID
        (Intercept)  Residual
StdDev: 3.688461e-05 0.4462984

Number of Observations: 4123
Number of Groups: 2

--- snip ---

Note that model.matrix() finds x and y in the environment of analyze_this(), and male and black in df.
But if you uncomment the line groupedData(y ~ X[, -1] | dummyID), the function fails:

--- snip ---

> analyze_this(growthIndiana)
Error in data.frame(y = y, X = X, dummyID = dummyID) :
  object 'X' not found

--- snip ---

This suggests that groupedData() is doing something unusual (which I don't have the inclination to figure out). I'm not sure why one needs to manipulate the model matrix directly like this, but I assume that there is some coherent reason or you wouldn't be asking. Also, isn't the formula for groupedData() supposed to have a *single* covariate on the right, like y ~ x | g (where y, x, and g are individual variables)?

Best,
John

On 2022-01-07 5:29 p.m., Key, Melissa wrote:

John,

Thanks for your response. I agree that the definition of the data frame is poor (in my defense, it came directly from the demo code, but I should have checked it more thoroughly). The good news is that your comments caused me to take a closer look at where X was defined, and I found the reason I wasn't getting the same results on my Mac and PC: that error was between keyboard and chair.

There is still something funny going on, though (at least relative to my previous experience with how R searches environments): if X is defined in the global environment, groupedData can find it there and use it (this is what I'm used to). If X is defined within a function, groupedData cannot find it, even if groupedData is called within the same function (this seems strange to me; usually parent.frame() captures information within the function environment, or so I thought).

My solution at the bottom still works, and unlike groupedData, nlme allows a list as input to the data argument (or at least doesn't check to make sure it's a data frame), so I have a working (albeit hacky) solution that actually makes more sense to me than using groupedData. But it still seems strange that the function cannot find X in its search path.

Thanks again!
Melissa

-----Original Message-----
From: John Fox
Sent: Friday, January 7, 2022 4:35 PM
To: Key, Melissa
Cc: r-help@r-project.org
Subject: [EXTERNAL] Re: [R] bug in Windows implementation of nlme::groupedData
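The formula-environment scoping that John quotes from ?formula can be demonstrated with a plain lm() call. The function fit_standardized and the use of the built-in cars data set are illustrative assumptions here, not part of Melissa's analysis:

```r
# A formula remembers the environment where it was created, so lm()
# finds the local x and y inside the function even with no data argument.
fit_standardized <- function(df) {
  x <- (df$speed - mean(df$speed)) / sd(df$speed)  # locals, not columns of df
  y <- (df$dist - mean(df$dist)) / sd(df$dist)
  lm(y ~ x)  # the formula's environment is this function call's frame
}

fit <- fit_standardized(cars)  # built-in 'cars' data set
environment(formula(fit))      # the formula's creation environment, not globalenv
coef(fit)["x"]                 # slope = cor(speed, dist) for standardized variables
```

This is the behavior John relies on in his lme() example; groupedData()'s failure to find X inside a function is what makes it look like that function departs from the usual scoping rules.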
Re: [R] bug in Windows implementation of nlme::groupedData
Dear Melissa,

It seems strange to me that your code would work on any platform (it doesn't on my Mac), because the data frame you create shouldn't contain a matrix named "X" but rather columns including those originating from X. To illustrate:

> X <- matrix(1:12, 4, 3)
> colnames(X) <- c("a", "b", "c")
> X
     a b  c
[1,] 1 5  9
[2,] 2 6 10
[3,] 3 7 11
[4,] 4 8 12
> y <- 1:4
> (D <- data.frame(y, X))
  y a b  c
1 1 1 5  9
2 2 2 6 10
3 3 3 7 11
4 4 4 8 12
> str(D)
'data.frame': 4 obs. of 4 variables:
 $ y: int 1 2 3 4
 $ a: int 1 2 3 4
 $ b: int 5 6 7 8
 $ c: int 9 10 11 12

My session info:

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] nlme_3.1-153 HRW_1.0-5

loaded via a namespace (and not attached):
[1] compiler_4.1.2 tools_4.1.2 KernSmooth_2.23-20 splines_4.1.2
[5] grid_4.1.2 lattice_0.20-45

I hope this helps,
John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/

On 2022-01-07 11:23 a.m., Key, Melissa wrote:

I am trying to replicate a semi-parametric analysis described in Harezlak, Jaroslaw, David Ruppert, and Matt P. Wand, Semiparametric Regression with R. New York, NY: Springer, 2018 (https://link.springer.com/book/10.1007%2F978-1-4939-8853-2). I can successfully run the analysis, but now I'm trying to move it into my workflow, which requires that the analysis be conducted within a function (using the targets package), and the groupedData function now fails with an error that it cannot find the X matrix (see reprex below). I've tried the reprex on both my personal Mac (where it works??)
and on Windows machines (where it does not), so the problem is likely specific to Windows computers (yes, this seems weird to me too). All packages have been updated, and I'm running the latest version of R on all machines.

Reprex:

library(HRW)  # contains example data and the ZOSull function
library(nlme)
data(growthIndiana)

analyze_this <- function(df) {
  mean.x <- mean(df$age)
  mean.y <- mean(df$height)
  sd.x <- sd(df$age)
  sd.y <- sd(df$height)
  df$x <- (df$age - mean.x) / sd.x
  df$y <- (df$height - mean.y) / sd.y
  X <- model.matrix(~ x * male * black, data = df)
  dummyID <- rep(1, length(nrow(X)))
  grouped_data <- groupedData(y ~ X[,-1] | rep(1, length = nrow(X)),
                              data = data.frame(y = df$y, X, dummyID))
}

# doesn't work on Windows machine, does work on the Mac
analyze_this(growthIndiana)
#> Error in eval(aux[[2]], object): object 'X' not found

# does work
df <- growthIndiana
mean.x <- mean(df$age)
mean.y <- mean(df$height)
sd.x <- sd(df$age)
sd.y <- sd(df$height)
df$x <- (df$age - mean.x) / sd.x
df$y <- (df$height - mean.y) / sd.y
X <- model.matrix(~ x * male * black, data = df)
dummyID <- rep(1, length(nrow(X)))
grouped_data <- groupedData(y ~ X[,-1] | rep(1, length = nrow(X)),
                            data = data.frame(y = df$y, X, dummyID))

# attempted work-around.
analyze_this2 <- function(df) {
  num.global.knots = 20
  num.subject.knots = 10
  mean.x <- mean(df$age)
  mean.y <- mean(df$height)
  sd.x <- sd(df$age)
  sd.y <- sd(df$height)
  df$x <- (df$age - mean.x) / sd.x
  df$y <- (df$height - mean.y) / sd.y
  X <- model.matrix(~ x * male * black, data = df)
  dummyID <- rep(1, length(nrow(X)))
  # grouped_data <- groupedData(y ~ X[,-1] | rep(1, length = nrow(X)),
  #                             data = data.frame(y = df$y, X, dummyID))
  global.knots = quantile(unique(df$x),
    seq(0, 1, length = num.global.knots + 2)[-c(1, num.global.knots + 2)])
  subject.knots = quantile(unique(df$x),
    seq(0, 1, length = num.subject.knots + 2)[-c(1, num.subject.knots + 2)])
  Z.global <- ZOSull(df$x, range.x = range(df$x), global.knots)
  Z.group <- df$black * Z.global
  Z.subject <- ZOSull(df$x, range.x = range(df$x), subject.knots)
  Zblock <- list(
    dummyID = pdIdent(~ 0 + Z.global),
    dummyID = pdIdent(~ 0 + Z.group),
    idnum = pdSymm(~ x),
    idnum = pdIdent(~ 0 + Z.subject)
  )
  df$dummyID <- dummyID
  tmp_data <- c(
    df,
    X = list(X),
    Z.global = list(Z.global),
    Z.group = list(Z.global),
    Z.subject = list(Z.subject)
  )
  fit <- lme(y ~ 0 + X, data = tmp_data, random = Zblock)
}

# this works (warning - lme takes awhile to fit)
analyze_this2(growthIndiana)

sessionInfo()
#> R version 4.1.2 (2021-11-01)
Re: [R] How to use ifelse without invoking warnings
Dear Ravi,

On 2021-10-08 8:21 a.m., Ravi Varadhan wrote:

Thank you to Bert, Sarah, and John. I did consider suppressing warnings, but I felt that there must be a more principled approach. While John's solution is what I would prefer, I cannot help but wonder why ifelse() was not constructed to avoid this behavior.

The conditional if () else, which works on an individual logical value, uses lazy evaluation and so can avoid the problem you encountered. My guess is that implementing lazy evaluation for the vectorized ifelse() would incur too high a computational overhead for large arguments.

Best,
John

Thanks & Best regards,
Ravi

From: John Fox
Sent: Thursday, October 7, 2021 2:00 PM
To: Ravi Varadhan
Cc: R-Help
Subject: Re: [R] How to use ifelse without invoking warnings

External Email - Use Caution

Dear Ravi,

It's already been suggested that you could disable warnings, but that's risky in case there's a warning that you didn't anticipate. Here's a different approach:

> kk <- k[k >= -1 & k <= n]
> ans <- numeric(length(k))
> ans[k > n] <- 1
> ans[k >= -1 & k <= n] <- pbeta(p, kk + 1, n - kk, lower.tail=FALSE)
> ans
[1] 0.0 0.006821826 0.254991551 1.0

BTW, I don't think that you mentioned that p = 0.3, but that seems apparent from the output you showed.
I hope this helps, John

-- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/

On 2021-10-07 12:29 p.m., Ravi Varadhan via R-help wrote: Hi, I would like to execute the following vectorized calculation:

ans <- ifelse(k >= -1 & k <= n,
              pbeta(p, k + 1, n - k, lower.tail = FALSE),
              ifelse(k < -1, 0, 1))

For example:

k <- c(-1.2, -0.5, 1.5, 10.4)
n <- 10
ans <- ifelse(k >= -1 & k <= n,
              pbeta(p, k + 1, n - k, lower.tail = FALSE),
              ifelse(k < -1, 0, 1))
Warning message:
In pbeta(p, k + 1, n - k, lower.tail = FALSE) : NaNs produced

print(ans)
[1] 0.000000000 0.006821826 0.254991551 1.000000000

The answer is correct. However, I would like to eliminate the annoying warnings. Is there a better way to do this?
Thank you, Ravi

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
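A self-contained sketch of the indexing approach from this thread; p = 0.3 is an assumption, inferred (as John notes) from the output Ravi showed:

```r
# ifelse() evaluates pbeta() on the *full* vectors, so out-of-range k
# trigger NaN warnings; indexing evaluates pbeta() only where it is valid.
k <- c(-1.2, -0.5, 1.5, 10.4)
n <- 10
p <- 0.3  # assumed, inferred from the output in the thread

ok <- k >= -1 & k <= n
ans <- numeric(length(k))  # entries with k < -1 stay 0
ans[k > n] <- 1
ans[ok] <- pbeta(p, k[ok] + 1, n - k[ok], lower.tail = FALSE)
ans
#> [1] 0.000000000 0.006821826 0.254991551 1.000000000
```

No warnings are produced, and the result matches the ifelse() version exactly.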
Re: [R] How to use ifelse without invoking warnings
Dear Ravi, It's already been suggested that you could disable warnings, but that's risky in case there's a warning that you didn't anticipate. Here's a different approach:

> kk <- k[k >= -1 & k <= n]
> ans <- numeric(length(k))
> ans[k > n] <- 1
> ans[k >= -1 & k <= n] <- pbeta(p, kk + 1, n - kk, lower.tail=FALSE)
> ans
[1] 0.000000000 0.006821826 0.254991551 1.000000000

BTW, I don't think that you mentioned that p = 0.3, but that seems apparent from the output you showed.

I hope this helps, John

-- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/

On 2021-10-07 12:29 p.m., Ravi Varadhan via R-help wrote: Hi, I would like to execute the following vectorized calculation:

ans <- ifelse(k >= -1 & k <= n,
              pbeta(p, k + 1, n - k, lower.tail = FALSE),
              ifelse(k < -1, 0, 1))

For example:

k <- c(-1.2, -0.5, 1.5, 10.4)
n <- 10
ans <- ifelse(k >= -1 & k <= n,
              pbeta(p, k + 1, n - k, lower.tail = FALSE),
              ifelse(k < -1, 0, 1))
Warning message:
In pbeta(p, k + 1, n - k, lower.tail = FALSE) : NaNs produced

print(ans)
[1] 0.000000000 0.006821826 0.254991551 1.000000000

The answer is correct. However, I would like to eliminate the annoying warnings. Is there a better way to do this? Thank you, Ravi

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error msg trying to load R Commander with an older R edition...
Dear Brian,

On 2021-09-13 9:33 a.m., Brian Lunergan wrote: Hi folks: I'm running Linux Mint 19.3 on my machine. Tried to install a more recent edition of R but I couldn't seem to get it working, so I pulled it off and went with a good, basic install of the edition available through the software manager. So... I'm running version 3.4.4. Mucking about with the attempt at a newer edition seems to have left some excess baggage behind. When I loaded R Commander and attempted to run it I received the following error message.

Error: package or namespace load failed for ‘car’ in readRDS(pfile):
  cannot read workspace version 3 written by R 3.6.2; need R 3.5.0 or newer
During startup - Warning message:
package ‘Rcmdr’ in options("defaultPackages") was not found

I get a similar message in RKWard when I try to load any more packages. Is there any solution for this? Any "leftovers" I can track down and delete? Any assistance would be greatly appreciated.

It's hard to know exactly how many things are wrong here, but one problem seems to be that you saved the R workspace in the newer version of R, and that the older version is trying to load the saved workspace, which is in an incompatible format. The workspace is probably saved in the file .RData in your R home directory. If that's the case, then you should see a message to this effect when R starts up. I'd begin by simply deleting this file. Then, if the Rcmdr package fails to load with an error indicating that car or another package is missing, I'd try installing the missing package(s). Finally, you might be better off persevering in your attempt to install the current version of R rather than the quite old version that you're trying to get working. I hope this helps, John

Kind regards...
__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
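A minimal sketch of the cleanup step John suggests, assuming the stale workspace is the .RData file in the directory where R starts up:

```r
# Remove a workspace saved by a newer R so the older R doesn't try to
# load it at startup.  Run this from R's startup directory.
if (file.exists(".RData")) {
  file.remove(".RData")
}
# If Rcmdr then fails to load because a dependency is missing,
# reinstall it, e.g.:
# install.packages("car")
```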
Re: [R] Can't add error bars to existing line graph
Dear Bruno,

There are (at least) two errors here: (1) I think that you misunderstand how interaction.plot(), and more generally R base graphics, work. interaction.plot() doesn't return a graphics object, but rather draws on a graphics device as a side effect. Of course, interaction.plot() is a function and so it must return something -- it invisibly returns NULL. (2) I assume that you independently computed Scaphmeans and Scaphse, although you didn't include the corresponding code in your message. In any event, the arrows() function generally takes 4 arguments (x0, y0, x1, y1), specifying the x and y coordinates of the endpoints of the arrows. It's true that because your "arrows" are intended to be vertical, you need not specify x1, which defaults to x0, but the other 3 arguments are necessary. See ?arrows for details. I hope this helps, John

John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/

On 2021-06-15 9:02 a.m., Bruno.Salonen wrote: Hi all, I'm trying to add error bars to an existing line graph in base R. The basic line graph comes up just fine, but it does not show my error bars...

Data frame = readscaphfileNEW
Plot name = SCAPHLINEGRAPHNEW
x axis = TEMP
y axis = SCAPH.BPM
Tracer = Year (SAME AS 'EXPERIMENT')
Scaphmeans = means of SCAPH.BPM
Scaphse = standard error of SCAPH.BPM

Here is the code:

SCAPHLINEGRAPHNEW <- interaction.plot(readscaphfileNEW$TEMP,
    readscaphfileNEW$EXPERIMENT, readscaphfileNEW$SCAPH.BPM,
    xlab = "Temperature (°C)", ylab = "Scaphognathite Rate (BPM)",
    main = "Scaphognathite", ylim = c(0, 300), trace.label = "Year",
    type = "b", pch = c(19, 17), fixed = TRUE)
arrows(SCAPHLINEGRAPHNEW, Scaphmeans + Scaphse,
       SCAPHLINEGRAPHNEW, Scaphmeans - Scaphse,
       code = 3, angle = 90, length = 0.1)

Why are my error bars not showing? Is the 'arrows' line wrong? Thanks a million for your help, everybody.
Here is my data set:

readscaphfileNEW
   EXPERIMENT TEMP SCAPH.BPM
1        2021   12        82
2        2021   12        58
3        2021   12        78
4        2021   12        59
5        2021   12        80
6        2021   12       100
7        2021   12        61
8        2021   12       103
9        2021   12        61
10       2021   17       100
11       2021   17        70
12       2021   17        83
13       2021   17        73
14       2021   17       143
15       2021   17       103
16       2021   17        73
17       2021   17       158
18       2021   17        95
19       2021   17        80
20       1939   12       158
21       1939   12       148
22       1939   12       152
23       1939   12       148
24       1939   12       160
25       1939   12       168
26       1939   12       152
27       1939   12       150
28       1939   12       187
29       1939   17       300
30       1939   17       302
31       1939   17       291
32       1939   17       240
33       1939   17       253
34       1939   17       207
35       1939   17       184
36       1939   17       224
37       1939   17       242
38       1939   17       236

Bruno -- Sent from: https://r.789695.n4.nabble.com/R-help-f789696.html

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
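Since interaction.plot() returns NULL invisibly, the coordinates for arrows() have to be computed directly; the x-positions are simply 1, 2, ... for the levels of the x-factor. A runnable sketch, using a small made-up data set in place of readscaphfileNEW (the values are hypothetical; only the column structure matches Bruno's):

```r
# Hypothetical stand-in for readscaphfileNEW
df <- data.frame(
  EXPERIMENT = factor(rep(c(1939, 2021), each = 4)),
  TEMP       = factor(rep(c(12, 12, 17, 17), times = 2)),
  SCAPH.BPM  = c(158, 148, 300, 302, 82, 58, 100, 70)
)

# Cell means and standard errors: one row per trace (EXPERIMENT),
# one column per x-level (TEMP)
means <- tapply(df$SCAPH.BPM, list(df$EXPERIMENT, df$TEMP), mean)
ses   <- tapply(df$SCAPH.BPM, list(df$EXPERIMENT, df$TEMP),
                function(x) sd(x) / sqrt(length(x)))

interaction.plot(df$TEMP, df$EXPERIMENT, df$SCAPH.BPM,
                 trace.label = "Year", type = "b", pch = c(19, 17),
                 fixed = TRUE, ylim = c(0, 350))
xpos <- col(means)  # interaction.plot() puts factor levels at x = 1, 2, ...
arrows(xpos, means - ses, xpos, means + ses,
       code = 3, angle = 90, length = 0.1)
```

The key point is that arrows() gets explicit numeric coordinates, not the (NULL) return value of interaction.plot().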
Re: [R] calculating area of ellipse
Dear Jeff,

I don't think that it would be sensible to claim that it *never* makes sense to multiply quantities measured in different units, but rather that this would rarely make sense for regression coefficients. James might have a justification for finding the area, but it is still, I think, reasonable to point out that doing so may be problematic. With respect to ratios of areas: I apologize if my examples were cryptic. Imagine, for example, that the same regression model is fit to two groups and a joint confidence ellipse for two coefficients computed for each. The ratio of the two areas would reflect the relative precision of the estimates in the two groups, which is unaffected by the units of measurement of the coefficients. This is also the idea behind generalized variance inflation, where the comparison is to a "utopian" situation in which the parameters are uncorrelated. For details, see help("vif", package="car") and in particular Fox, J. and Monette, G. (1992) Generalized collinearity diagnostics. JASA, 87, 178-183. Best, John

On 2021-05-11 10:48 a.m., Jeff Newmiller wrote: The area is a product, not a ratio. There are certainly examples out there of meaningful products of different units, such as distance * force (work) or power * time (work). If you choose to form a ratio with the area as numerator, you could conceivably obtain the numerator with force and distance and then meaningfully form a ratio with time (power). So this asserted requirement as to homogeneous units seems inaccurate. But without context I don't know if any of this will aid in interpretation of variance for the OP.

On May 11, 2021 7:30:22 AM PDT, John Fox wrote: Dear Stephen, On 2021-05-11 10:20 a.m., Stephen Ellison wrote: In doing meta-analysis of diagnostic accuracy I produce ellipses of confidence and prediction intervals in two dimensions. How can I calculate the area of the ellipse in ggplot2 or base R?
There are established formulae for ellipse area, but I am curious: in a 2-d ellipse with different quantities (eg coefficients for salary and age) represented by the different dimensions, what does 'area' mean?

I answered James's question narrowly, but the point you raise is correct -- the area isn't directly interpretable unless the coefficients are measured in the same units. It still may be possible to compare areas of ellipsoids for, say, different regressions with the same predictors, as ratios, however, since these ratios would be unaffected by rescaling the coefficients. The generalization of this idea to ellipsoids of any dimension is the basis for the generalized variance-inflation factors computed by the vif() function in the car package. Best, John

John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/

S

*** This email and any attachments are confidential. Any use...{{dropped:8}}

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] calculating area of ellipse
Dear Stephen, On 2021-05-11 10:20 a.m., Stephen Ellison wrote: >> In doing meta-analysis of diagnostic accuracy I produce ellipses of confidence >> and prediction intervals in two dimensions. How can I calculate the area of >> the ellipse in ggplot2 or base R? > > There are established formulae for ellipse area, but I am curious: in a 2-d ellipse with different quantities (eg coefficients for salary and age) represented by the different dimensions, what does 'area' mean? I answered James's question narrowly, but the point you raise is correct -- the area isn't directly interpretable unless the coefficients are measured in the same units. It still may be possible to compare areas of ellipsoids for, say, different regressions with the same predictors, as ratios, however, since these ratios would be unaffected by rescaling the coefficients. The generalization of this idea to ellipsoids of any dimension is the basis for the generalized variance-inflation factors computed by the vif() function in the car package. Best, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ > > S > > > *** > This email and any attachments are confidential. Any use...{{dropped:8}} > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] calculating area of ellipse
Dear David and Jim,

As I explained yesterday, a confidence ellipse is based on a quadratic form in the inverse of the covariance matrix of the estimated coefficients. When the coefficients are uncorrelated, the axes of the ellipse are parallel to the parameter axes, and the radii of the ellipse are just a constant times the standard deviations of the coefficients. The constant is typically the square root of twice a corresponding quantile (say, 0.95) of an F distribution with 2 numerator df, or of a quantile of the chi-square distribution with 2 df. In the more general case, the confidence ellipse is tilted, and the radii correspond to the square roots of the eigenvalues of the coefficient covariance matrix, again multiplied by a constant. That explains the result I gave yesterday based on the determinant of the coefficient covariance matrix, which is the product of its eigenvalues. These results generalize readily to ellipsoids in higher dimensions, and to degenerate cases, such as perfectly correlated coefficients. For more on the statistics of ellipses, see <http://euclid.psych.yorku.ca/datavis/papers/ellipses-STS402.pdf>. Best, John

John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/

On 2021-05-06 10:31 p.m., David Winsemius wrote: On 5/6/21 6:29 PM, Jim Lemon wrote: Hi James, If the result contains the major (a) and minor (b) axes of the ellipse, it's easy: area <- pi*a*b ITYM semi-major and semi-minor axes.

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] calculating area of ellipse
Dear James,

To mix notation a bit, presumably the (border of the) confidence ellipse is of the form (b - beta)' V(b)^-1 (b - beta) = c^2, where V(b) is the covariance matrix of b and c^2 is a constant. Then the area of the ellipse is pi*c^2*sqrt(det(V(b))). It shouldn't be hard to translate that into R code. I hope this helps, John

John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/

On 2021-05-06 7:24 a.m., james meyer wrote: In doing meta-analysis of diagnostic accuracy I produce ellipses of confidence and prediction intervals in two dimensions. How can I calculate the area of the ellipse in ggplot2 or base R? thank you James Meyer

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
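Translating the formula into R, a sketch for a 95% joint confidence ellipse for two coefficients of a linear model (the data here are simulated, and c^2 = 2 * qf(.95, 2, df) is the F-based constant described in the follow-up message):

```r
# Area of the 95% confidence ellipse for two regression coefficients:
# the ellipse is (b - beta)' V^{-1} (b - beta) = c2, and its area is
# pi * c2 * sqrt(det(V)).
set.seed(1)
x1 <- rnorm(100)
x2 <- rnorm(100)
y  <- 1 + 2 * x1 - x2 + rnorm(100)
m  <- lm(y ~ x1 + x2)

V  <- vcov(m)[c("x1", "x2"), c("x1", "x2")]  # covariance matrix of (b1, b2)
c2 <- 2 * qf(0.95, 2, df.residual(m))        # RHS of the quadratic form
area <- pi * c2 * sqrt(det(V))

# Cross-check: the semi-axes are sqrt(c2 * eigenvalues of V), and the
# area of an ellipse is pi times the product of its semi-axes.
ev <- eigen(V, only.values = TRUE)$values
all.equal(area, pi * prod(sqrt(c2 * ev)))
#> TRUE
```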
Re: [R] Contrasts in coxph
Dear John,

It's not clear to me exactly what you have in mind, but car::linearHypothesis(), multcomp::glht(), and the emmeans package work with Cox models. I expect there are functions in other packages that will work too. Here's an example, surely simpler than what you have in mind, but you can probably adapt it:

-- snip -

> library("survival")
> library("car")
Loading required package: carData
> mod.allison <- coxph(Surv(week, arrest) ~
+     fin + age + race + wexp + mar + paro + prio,
+     data=Rossi)
> mod.allison
Call:
coxph(formula = Surv(week, arrest) ~ fin + age + race + wexp +
    mar + paro + prio, data = Rossi)

                    coef exp(coef) se(coef)      z       p
finyes          -0.37942   0.68426  0.19138 -1.983 0.04742
age             -0.05744   0.94418  0.02200 -2.611 0.00903
raceother       -0.31390   0.73059  0.30799 -1.019 0.30812
wexpyes         -0.14980   0.86088  0.21222 -0.706 0.48029
marnot married   0.43370   1.54296  0.38187  1.136 0.25606
paroyes         -0.08487   0.91863  0.19576 -0.434 0.66461
prio             0.09150   1.09581  0.02865  3.194 0.00140

Likelihood ratio test=33.27 on 7 df, p=2.362e-05
n= 432, number of events= 114

> linearHypothesis(mod.allison, "finyes")
Linear hypothesis test

Hypothesis:
finyes = 0

Model 1: restricted model
Model 2: Surv(week, arrest) ~ fin + age + race + wexp + mar + paro + prio

  Res.Df Df  Chisq Pr(>Chisq)
1    426
2    425  1 3.9306    0.04742 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> library("multcomp")
Loading required package: mvtnorm
Loading required package: TH.data
Loading required package: MASS

Attaching package: ‘TH.data’

The following object is masked from ‘package:MASS’:

    geyser

> summary(glht(mod.allison, "finyes=0"))

Simultaneous Tests for General Linear Hypotheses

Fit: coxph(formula = Surv(week, arrest) ~ fin + age + race + wexp +
    mar + paro + prio, data = Rossi)

Linear Hypotheses:
            Estimate Std. Error z value Pr(>|z|)
finyes == 0  -0.3794     0.1914  -1.983   0.0474 *
---
Signif.
codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)

> library(emmeans)
> pairs(emmeans(mod.allison, ~ fin))
 contrast estimate    SE  df z.ratio p.value
 no - yes    0.379 0.191 Inf   1.983  0.0474

Results are averaged over the levels of: race, wexp, mar, paro
Results are given on the log (not the response) scale.

-- snip -

John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/

On 2021-04-05 11:28 p.m., Sorkin, John wrote: I would like to define contrasts on the output of a coxph function. It appears that the contrast function from the contrast library does not have a method defined that will allow computation of contrasts on a coxph object. How does one define and evaluate contrasts for a cox model? Thank you, John

John David Sorkin M.D., Ph.D. Professor of Medicine Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing)

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plotting adjusted KM curve
On 2021-04-04 10:45 p.m., John Fox wrote: Dear John, I think that what you're looking for is

plot(survfit(fit1Cox, newdata=data.frame(age=rep(65, 2),
     sex=factor("female", "male"))))

Whoops, that should be

plot(survfit(fit1Cox, newdata=data.frame(age=rep(65, 2),
     sex=factor(c("female", "male")))))

John

assuming, of course, that sex is a factor with levels "female" and "male". I hope this helps, John

John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/

On 2021-04-04 9:34 p.m., Sorkin, John wrote: Colleagues, I am using coxph to model survival time. How do I plot an adjusted Kaplan-Meier plot resulting from coxph? The code I would like to run would start with:

# run cox model
fit1Cox <- coxph(surv_object ~ age + sex, data=mydata)

I have no idea what would follow. I would like to plot adjusted KM curves for men vs. women at age 65. Thank you, John

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plotting adjusted KM curve
Dear John,

I think that what you're looking for is

plot(survfit(fit1Cox, newdata=data.frame(age=rep(65, 2),
     sex=factor("female", "male"))))

assuming, of course, that sex is a factor with levels "female" and "male". I hope this helps, John

John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/

On 2021-04-04 9:34 p.m., Sorkin, John wrote: Colleagues, I am using coxph to model survival time. How do I plot an adjusted Kaplan-Meier plot resulting from coxph? The code I would like to run would start with:

# run cox model
fit1Cox <- coxph(surv_object ~ age + sex, data=mydata)

I have no idea what would follow. I would like to plot adjusted KM curves for men vs. women at age 65. Thank you, John

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using eigen function in MAP and purr
Dear V. K. Chetty,

Perhaps I'm missing something, but why wouldn't you just use a list of matrices, as in the following?

-- snip -

> set.seed(123) # for reproducibility
> (Matrices <- lapply(1:3, function(i) matrix(sample(1:50, 4), 2, 2)))
[[1]]
     [,1] [,2]
[1,]   31   14
[2,]   15    3

[[2]]
     [,1] [,2]
[1,]   42   37
[2,]   43   14

[[3]]
     [,1] [,2]
[1,]   25   27
[2,]   26    5

> (Eigenvalues <- lapply(Matrices, function(x) eigen(x, only.values=TRUE)$values))
[[1]]
[1] 37.149442 -3.149442

[[2]]
[1]  70.27292 -14.27292

[[3]]
[1]  43.3196 -13.3196

-- snip -

I hope this helps, John

John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/

On 2021-03-29 5:28 p.m., Veerappa Chetty wrote: I want to use map and purrr functions to compute eigenvalues for 3000 matrices. Each matrix has 2 rows and 2 columns. The following code does not work.

test.dat <- tibble(ID=c(1,2), a=c(1,1), b=c(1,1), c=c(2,2), d=c(4,3))
test.out <- test.dat %>%
  nest(-ID) %>%
  mutate(fit = purrr::map(data, ~ function(x) eigen(matrix(x,2,2)), data=.))

This must be a trivial question for current young practitioners (In my 9th decade, I am having fun using R Markdown and I am trying to continue my research!) I would greatly appreciate any help. Thanks. V.K.Chetty

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
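For the row-per-matrix layout in the original question, a base-R sketch (no purrr needed; the column order a, b, c, d fills each 2 x 2 matrix column-wise, matching matrix(x, 2, 2) in the original code):

```r
# Each row of test.dat holds the entries of one 2 x 2 matrix,
# filled column-wise: rbind(c(a, c), c(b, d)).
test.dat <- data.frame(ID = c(1, 2), a = c(1, 1), b = c(1, 1),
                       c = c(2, 2), d = c(4, 3))

eigs <- lapply(seq_len(nrow(test.dat)), function(i) {
  m <- matrix(as.numeric(test.dat[i, c("a", "b", "c", "d")]), 2, 2)
  eigen(m, only.values = TRUE)$values
})
eigs
```

The same lapply() call scales directly to 3000 rows; a purrr version would simply replace lapply() with purrr::map().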
Re: [R] Error using nls function
Dear David,

I'm afraid that this doesn't make much sense -- that is, I expect that you're not doing what you intended. First, sin(2*pi*t) and cos(2*pi*t) are each invariant:

> sin(2*pi*t)
 [1] -2.449294e-16 -4.898587e-16 -7.347881e-16 -9.797174e-16 -1.224647e-15 -1.469576e-15
 [7] -1.714506e-15 -1.959435e-15 -2.204364e-15 -2.449294e-15 -9.799650e-15 -2.939152e-15
> cos(2*pi*t)
 [1] 1 1 1 1 1 1 1 1 1 1 1 1

Second, as formulated the model is linear in the parameters. I hope this helps, John

John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/

On 2021-03-26 8:31 a.m., David E.S. wrote: I'm trying to fit a harmonic equation to my data, but when I'm applying the nls function, R gives me the following error: Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates. All posts I've seen, related to this error, are of exponential functions, where a linearization is used to fix this error, but in this case, I'm not able to solve it in this way. I tried to use other starting points but it is still not working.

y <- c(20.91676, 20.65219, 20.39272, 20.58692, 21.64712, 23.30965,
       23.35657, 24.22724, 24.83439, 24.34865, 23.13173, 21.96117)
t <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)

# Fitting function
fit <- function(x, a, b, c) {a + b*sin(2*pi*x) + c*cos(2*pi*x)}

res <- nls(y ~ fit(t, a, b, c), data=data.frame(t, y),
           start = list(a=1, b=0, c=1))

Can you help me? Thanks! David -- Sent from: https://r.789695.n4.nabble.com/R-help-f789696.html

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
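A sketch of one plausible repair for the thread above, assuming the data are monthly so the intended angular frequency is 2*pi*t/12 (an assumption; with it the sin/cos columns are no longer constant at integer t), and using lm() since, as John notes, the model is linear in a, b, and c:

```r
y <- c(20.91676, 20.65219, 20.39272, 20.58692, 21.64712, 23.30965,
       23.35657, 24.22724, 24.83439, 24.34865, 23.13173, 21.96117)
t <- 1:12

# Linear in a, b, c: no iterative fitting (and no starting values) needed.
fit <- lm(y ~ sin(2 * pi * t / 12) + cos(2 * pi * t / 12))
coef(fit)  # intercept a, sine coefficient b, cosine coefficient c
```

With the constant regressors removed, nls() would also converge, but lm() solves the problem in one step.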
Re: [R] library(hms)
Dear Greg,

As I explained to you in a private email, and as others have told you, there is no Install.libraries() command, nor is there an install.libraries() command, but there is an install.packages() command. So install.packages("hms") should work, on a Mac or on any other internet-connected computer on which R runs -- as you've also been told by others, this is not a Mac-specific issue. Note that the argument to install.packages() must be quoted. See ?install.packages for details. I'll also repeat the advice that I gave you privately to learn something about R before you try to use it, possibly starting with the "An Introduction to R" manual that ships with the standard R distribution. Best, John

John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/

On 2021-03-17 1:07 p.m., Gregory Coats wrote: On my MacBook, I do not have, and do not know how to install, library(hms). Greg Coats

library(hms)
Error in library(hms) : there is no package called ‘hms’
Install.libraries(“hms”)
Error: unexpected input in "Install.libraries(“"

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to plot dates
Dear Greg,

Coordinate plots typically have a horizontal (x) and vertical (y) axis. The command ggplot(myDat, aes(x=datetime, y = datetime)) + geom_point() works, but I doubt that it produces what you want. You have only one variable in your data set -- datetime -- so it's not obvious what you want to do. If you can't clearly describe the structure of the plot you intend to draw, it's doubtful that I or anyone else can help you. Best, John

John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/

On 2021-03-16 2:56 p.m., Gregory Coats via R-help wrote: I need a plot that shows the date and time that each event started. This ggplot command was publicly given to me via this R Help Mailing List. But the result of issuing the ggplot command is an Error in FUN message.

ggplot(myDat, aes(x=datetime, y = Y_Var)) + geom_point()
Error in FUN(X[[i]], ...) : object 'Y_Var' not found

Greg Coats

On Mar 16, 2021, at 2:18 PM, John Fox wrote: There is no variable named Y_Var in your data set. I suspect that it's intended to be a generic specification in the recipe you were apparently given. In fact, there appears to be only one variable in myDat and that's datetime. What is it that you're trying to do?

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to plot dates
Dear Greg, There is no variable named Y_Var in your data set. I suspect that it's intended to be a generic specification in the recipe you were apparently given. In fact, there appears to be only one variable in myDat and that's datetime. What is it that you're trying to do? A more general comment: If I'm correct and you're just following a recipe, that's a recipe for problems. You'd probably be more successful if you tried to learn how ggplot(), etc., work. My apologies if I'm misinterpreting the source of your difficulties. I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2021-03-16 12:21 p.m., Gregory Coats via R-help wrote: Sarah, Thank you. Yes, now as.POSIXct works. But the ggplot command I was told to use yields an Error message, and there is no output plot. Please help me. Greg library(ggplot2) myDat <- read.table(text = + "datetime + 2021-03-11 10:00:00 + 2021-03-11 14:17:00 + 2021-03-12 05:16:46 + 2021-03-12 09:17:02 + 2021-03-12 13:31:43 + 2021-03-12 22:00:32 + 2021-03-13 09:21:43", + sep = ",", header = TRUE) head(myDat) datetime 1 2021-03-11 10:00:00 2 2021-03-11 14:17 3 2021-03-12 05:16:46 4 2021-03-12 09:17:02 5 2021-03-12 13:31:43 6 2021-03-12 22:00:32 myDat$datetime <- as.POSIXct(myDat$datetime, tz = "", format ="%Y-%M-%d %H:%M:%OS”) ggplot(myDat, aes(x=datetime, y = Y_Var)) + geom_point() Error in FUN(X[[i]], ...) : object 'Y_Var' not found On Mar 16, 2021, at 9:36 AM, Sarah Goslee wrote: Hi, It doesn't have anything to do with having a Mac - you have POSIX. It's because something is wrong with your data import. Looking at the head() output you provided, it looks like your data file does NOT have a header, because there's no datetime column, and the column name is actually X2021.03.11.10.00.0 So you specified a nonexistent column, and got a zero-length answer. 
With correct specification, the as.POSIXct function works as expected on Mac: myDat <- read.table(text = "datetime 2021-03-11 10:00:00 2021-03-11 14:17:00 2021-03-12 05:16:46 2021-03-12 09:17:02 2021-03-12 13:31:43 2021-03-12 22:00:32 2021-03-13 09:21:43", sep = ",", header = TRUE) myDat$datetime <- as.POSIXct(myDat$datetime, tz = "", format = "%Y-%M-%d %H:%M:%OS") Sarah On Tue, Mar 16, 2021 at 9:26 AM Gregory Coats via R-help wrote: My computer is an Apple MacBook. I do not have POSIX. The command myDat$datetime <- as.POSIXct(myDat$datetime, tz = "", format = "%Y-%M-%d %H:%M:%OS") yields the error Error in `$<-.data.frame`(`*tmp*`, datetime, value = numeric(0)) : replacement has 0 rows, data has 13 Please advise, How to proceed? Greg Coats library(ggplot2) # Read a txt file on the Desktop, named "myDat.txt" myDat <- read.delim("~/Desktop/myDat.txt", header = TRUE, sep = ",") head(myDat) X2021.03.11.10.00.00 1 2021-03-11 14:17:00 2 2021-03-12 05:16:46 3 2021-03-12 09:17:02 4 2021-03-12 13:31:43 5 2021-03-12 22:00:32 6 2021-03-13 09:21:43 # convert data to date time object myDat$datetime <- as.POSIXct(myDat$datetime, tz = "", format = "%Y-%M-%d %H:%M:%OS") Error in `$<-.data.frame`(`*tmp*`, datetime, value = numeric(0)) : replacement has 0 rows, data has 13 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Sarah Goslee (she/her) http://www.numberwright.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
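One detail worth flagging in the commands above: in strptime()-style formats, "%m" is the month (01-12) while "%M" is the minute (00-59), so timestamps like these want "%Y-%m-%d %H:%M:%S". A self-contained sketch of the parsing step:

```r
# Read the timestamps as a one-column data frame, then convert to POSIXct
myDat <- read.table(text = "datetime
2021-03-11 10:00:00
2021-03-11 14:17:00
2021-03-12 05:16:46",
                    sep = ",", header = TRUE)

# "%m" = month, "%M" = minute
myDat$datetime <- as.POSIXct(myDat$datetime, tz = "UTC",
                             format = "%Y-%m-%d %H:%M:%S")
format(myDat$datetime[1], "%m")  # month parsed correctly: "03"
```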
Re: [R] mpfr function in Rmpfr crashes R
Dear Roger, This works perfectly fine for me on an apparently similar system, with the exceptions that I'm running R 4.0.4, have many fewer packages loaded, and am in a slightly different locale: --- snip > Rmpfr::mpfr(pi, 120) 1 'mpfr' number of precision 120 bits [1] 3.1415926535897931159979634685441851616 > sessionInfo() R version 4.0.4 (2021-02-15) Platform: x86_64-apple-darwin17.0 (64-bit) Running under: macOS Big Sur 10.16 Matrix products: default LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib locale: [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] Rmpfr_0.8-2 gmp_0.6-2 loaded via a namespace (and not attached): [1] compiler_4.0.4 htmltools_0.5.1.1 tools_4.0.4 yaml_2.2.1 rmarkdown_2.6 [6] knitr_1.31 xfun_0.21 digest_0.6.27 packrat_0.5.0 rlang_0.4.10 [11] evaluate_0.14 --- snip You might try updating R or running Rmpfr in a cleaner session. I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2021-03-06 7:07 p.m., Roger Bos wrote: All, The following code crashes R on my mac with a message "R session aborted. A fatal error occurred". ``` library(Rmpfr) Rmpfr::mpfr(pi, 120) ``` Does anyone have any suggestions? 
My session info is below: R version 4.0.3 (2020-10-10) Platform: x86_64-apple-darwin17.0 (64-bit) Running under: macOS Big Sur 10.16 Matrix products: default LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] datasets utils stats graphics grDevices methods base other attached packages: [1] alphavantager_0.1.2 googlesheets4_0.2.0 googledrive_1.0.1 clipr_0.7.1 [5] jsonlite_1.7.2 stringi_1.5.3 dtplyr_1.0.1 data.table_1.13.6 [9] dplyr_1.0.4 plyr_1.8.6 testthat_3.0.1 lubridate_1.7.9.2 [13] timeDate_3043.102 sendmailR_1.2-1 rmarkdown_2.6 devtools_2.3.2 [17] usethis_2.0.0 xts_0.12.1 zoo_1.8-8 MASS_7.3-53 [21] fortunes_1.5-4 loaded via a namespace (and not attached): [1] tinytex_0.29 tidyselect_1.1.0 xfun_0.20 remotes_2.2.0 purrr_0.3.4 [6] gargle_0.5.0 lattice_0.20-41 generics_0.1.0 vctrs_0.3.6 htmltools_0.5.1.1 [11] base64enc_0.1-3 rlang_0.4.10 pkgbuild_1.2.0 pillar_1.4.7 glue_1.4.2 [16] withr_2.4.1 DBI_1.1.1 sessioninfo_1.1.1 lifecycle_0.2.0 cellranger_1.1.0 [21] evaluate_0.14 memoise_2.0.0 knitr_1.31 callr_3.5.1 fastmap_1.1.0 [26] ps_1.5.0 curl_4.3 Rcpp_1.0.6 openssl_1.4.3 cachem_1.0.1 [31] desc_1.2.0 pkgload_1.1.0 fs_1.5.0 askpass_1.1 digest_0.6.27 [36] processx_3.4.5 grid_4.0.3 rprojroot_2.0.2 cli_2.3.0 tools_4.0.3 [41] magrittr_2.0.1 tibble_3.0.6 crayon_1.4.0 pkgconfig_2.0.3 ellipsis_0.3.1 [46] prettyunits_1.1.1 httr_1.4.2 assertthat_0.2.1 R6_2.5.0 compiler_4.0.3 19:05:52 > Thanks, Roger [[alternative HTML version deleted]] 
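The "cleaner session" diagnostic suggested above can be sketched as follows (a hedged sketch: the shell line assumes Rmpfr is installed, and the version checks are a quick way to compare the working and crashing setups):

```r
# From a terminal, bypass startup files and any attached packages with:
#   R --vanilla -e 'Rmpfr::mpfr(pi, 120)'
# If that works but the normal session crashes, a startup file or another
# loaded package is the likely culprit.

# Within R, compare versions between the two setups:
R.version.string
packageVersion("base")     # base packages track the R version itself
# packageVersion("Rmpfr")  # uncomment if Rmpfr is installed
```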
Re: [R] Out from an R package
Dear Goran, It's not clear from your question what you want to do, but my guess is that you simply want a "printout" of your results. The usual way to obtain that is via the summary() function. In your case summary(Output). That's typical of statistical modeling functions in R: They return objects, which can be used for further computing, rather than directly producing printouts. If my guess is correct, then you probably should learn more about statistical modeling in R, and about R in general, before using it in your work. One more thing: I doubt whether the command Output <- lmer(G10ln ~ v191_ms + (1 | couno), data = 'G10R') actually works. The data argument should be a data frame, not the *name* of a data frame, i.e., data = G10R. I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2021-02-25 10:24 a.m., Göran Djurfeldt wrote: Help! I am going crazy for a very simple reason. I can’t access the output from for instance the lme4 package in R. I have been able to import an SPSS file into an R data frame. I have downloaded and installed the Lme4 package and I think I have also learnt how to produce a mixed model with lmer: Output <- lmer(G10ln ~ v191_ms + (1 | couno), data = 'G10R') How shall I define the output from lmer? What kind of object is it? How do I define it? Goran [[alternative HTML version deleted]]
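The point about the data argument can be illustrated with base lm() (the same rule applies to lmer()); the data frame and variable names here are made up for illustration:

```r
# The data argument must be the data frame object itself, not its quoted name
set.seed(1)
G10R <- data.frame(G10ln = rnorm(20), v191_ms = rnorm(20))  # hypothetical data

fit <- lm(G10ln ~ v191_ms, data = G10R)  # data = 'G10R' would be an error
summary(fit)                             # the usual "printout" of results
class(fit)                               # the returned object's class
```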
Re: [R] Different results on running Wilcoxon Rank Sum test in R and SPSS
Dear Bharat Rawlley, On 2021-01-20 1:45 p.m., bharat rawlley via R-help wrote: Dear Professor John, Thank you very much for your reply! I agree with you that the non-parametric tests I mentioned in my previous email (Moods median test and Median test) do not make sense in this situation as they treat PFD_n and drug_code as different groups. As you correctly said, I want to use PFD_n as a vector of scores and drug_code to make two groups out of it. This is exactly what the Independent samples median test does in SPSS. I wish to perform the same test in R and am unable to do so. Simply put, I am asking how to perform the Independent samples median test in R just like it is performed in SPSS? I'm afraid that I'm the wrong person to ask, since I haven't used SPSS in perhaps 30 years and have no idea what it does to test for differences in medians. A Google search for "independent samples median test in R" turns up a number of hits. Secondly, for the question you are asking about the test statistic, I have not performed the Wilcoxon Rank sum test in SPSS for the PFD_n and drug_code data. I have said something to the contrary in my first email, I apologize for that. For continuous data, the Wilcoxon test is, I believe, a reasonable choice, but not when there are so many ties. If SPSS doesn't perform a Wilcoxon test for a difference in medians, then there's of course no reason to expect that the p-values would be the same. Best, John Thank you very much for your time! Yours sincerelyBharat RawlleyOn Wednesday, 20 January, 2021, 04:47:21 am IST, John Fox wrote: Dear Bharat Rawlley, What you tried to do appears to be nonsense. That is, you're treating PFD_n and drug_code as if they were scores for two different groups. I assume that what you really want to do is to treat PFD_n as a vector of scores and drug_code as defining two groups. 
If that's correct, and with your data into Data, you can try the following: --snip -- > wilcox.test(PFD_n ~ drug_code, data=Data, conf.int=TRUE) Wilcoxon rank sum test with continuity correction data: PFD_n by drug_code W = 197, p-value = 0.05563 alternative hypothesis: true location shift is not equal to 0 95 percent confidence interval: -2.14e+00 5.037654e-05 sample estimates: difference in location -1.19 Warning messages: 1: In wilcox.test.default(x = c(27, 26, 20, 24, 28, 28, 27, 27, 26, : cannot compute exact p-value with ties 2: In wilcox.test.default(x = c(27, 26, 20, 24, 28, 28, 27, 27, 26, : cannot compute exact confidence intervals with ties --snip -- You can get an approximate confidence interval by specifying exact=FALSE: --snip -- > wilcox.test(PFD_n ~ drug_code, data=Data, conf.int=TRUE, exact=FALSE) Wilcoxon rank sum test with continuity correction data: PFD_n by drug_code W = 197, p-value = 0.05563 alternative hypothesis: true location shift is not equal to 0 95 percent confidence interval: -2.14e+00 5.037654e-05 sample estimates: difference in location -1.19 --snip -- As it turns out, your data are highly discrete and have a lot of ties (see in particular PFD_n = 28): --snip -- > xtabs(~ PFD_n + drug_code, data=Data) drug_code PFD_n 0 1 0 2 0 16 1 1 18 0 1 19 0 1 20 2 0 22 0 1 24 2 0 25 1 2 26 5 2 27 4 2 28 5 13 30 1 2 --snip -- I'm no expert in nonparametric inference, but I doubt whether the approximate p-value will be very accurate for data like these. I don't know why wilcox.test() (correctly used) and SPSS are giving you slightly different results -- assuming that you're actually doing the same thing in both cases. I couldn't help but notice that most of your data are missing. Are you getting the same value of the test statistic and different p-values, or is the test statistic different as well? 
I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2021-01-19 5:46 a.m., bharat rawlley via R-help wrote: Thank you for the reply and suggestion, Michael! I used dput() and this is the output I can share with you. Simply explained, I have 3 columns namely, drug_code, freq4w_n and PFD_n. Each column has 132 values (including NA). The problem with the Wilcoxon Rank Sum test has been described in my first email. Please do let me know if you need any further clarification from my side! Thanks a lot for your time! structure(list(drug_code = c(0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1,
Re: [R] Different results on running Wilcoxon Rank Sum test in R and SPSS
Dear Bharat Rawlley, What you tried to do appears to be nonsense. That is, you're treating PFD_n and drug_code as if they were scores for two different groups. I assume that what you really want to do is to treat PFD_n as a vector of scores and drug_code as defining two groups. If that's correct, and with your data into Data, you can try the following: --snip -- > wilcox.test(PFD_n ~ drug_code, data=Data, conf.int=TRUE) Wilcoxon rank sum test with continuity correction data: PFD_n by drug_code W = 197, p-value = 0.05563 alternative hypothesis: true location shift is not equal to 0 95 percent confidence interval: -2.14e+00 5.037654e-05 sample estimates: difference in location -1.19 Warning messages: 1: In wilcox.test.default(x = c(27, 26, 20, 24, 28, 28, 27, 27, 26, : cannot compute exact p-value with ties 2: In wilcox.test.default(x = c(27, 26, 20, 24, 28, 28, 27, 27, 26, : cannot compute exact confidence intervals with ties --snip -- You can get an approximate confidence interval by specifying exact=FALSE: --snip -- > wilcox.test(PFD_n ~ drug_code, data=Data, conf.int=TRUE, exact=FALSE) Wilcoxon rank sum test with continuity correction data: PFD_n by drug_code W = 197, p-value = 0.05563 alternative hypothesis: true location shift is not equal to 0 95 percent confidence interval: -2.14e+00 5.037654e-05 sample estimates: difference in location -1.19 --snip -- As it turns out, your data are highly discrete and have a lot of ties (see in particular PFD_n = 28): --snip -- > xtabs(~ PFD_n + drug_code, data=Data) drug_code PFD_n 0 1 0 2 0 16 1 1 18 0 1 19 0 1 20 2 0 22 0 1 24 2 0 25 1 2 26 5 2 27 4 2 28 5 13 30 1 2 --snip -- I'm no expert in nonparametric inference, but I doubt whether the approximate p-value will be very accurate for data like these. I don't know why wilcox.test() (correctly used) and SPSS are giving you slightly different results -- assuming that you're actually doing the same thing in both cases. 
I couldn't help but notice that most of your data are missing. Are you getting the same value of the test statistic and different p-values, or is the test statistic different as well? I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2021-01-19 5:46 a.m., bharat rawlley via R-help wrote: Thank you for the reply and suggestion, Michael! I used dput() and this is the output I can share with you. Simply explained, I have 3 columns namely, drug_code, freq4w_n and PFD_n. Each column has 132 values (including NA). The problem with the Wilcoxon Rank Sum test has been described in my first email. Please do let me know if you need any further clarification from my side! Thanks a lot for your time! structure(list(drug_code = c(0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0), freq4w_n = c(1, NA, NA, 0, NA, 4, NA, 10, NA, 0, 6, NA, NA, NA, NA, NA, 10, NA, 0, NA, NA, NA, NA, 0, NA, 0, NA, NA, NA, 0, NA, 0, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 12, 0, NA, 1, 2, 1, 2, 2, NA, 28, 0, NA, 4, NA, 1, NA, NA, NA, NA, NA, 0, 3, 1, NA, NA, NA, NA, 4, 28, NA, NA, 0, 2, 12, 0, NA, NA, NA, 0, NA, 0, NA, NA, NA, NA, NA, NA, NA, NA, NA, 3, NA, NA, NA, NA, NA, NA, 6, 1, NA, NA, NA, 0, NA, NA, NA, 0, 0, NA, 0, NA, 2, 8, 3, NA, NA, NA, 0, NA, NA, NA, 9, NA, NA, NA, NA, NA, NA, NA, NA), PFD_n = c(27, NA, NA, 28, NA, 26, NA, 20, NA, 30, 24, NA, NA, NA, NA, NA, 18, NA, 28, NA, NA, NA, NA, 28, NA, 28, NA, NA, NA, 28, NA, 28, NA, NA, NA, NA, NA, NA, NA, NA, 28, 28, 16, 28, NA, 27, 26, 27, 26, 26, NA, 0, 30, NA, 24, NA, 27, NA, NA, NA, NA, NA, 28, 25, 27, NA, NA, 
NA, NA, 26, 0, NA, NA, 28, 26, 16, 28, NA, NA, NA, 28, NA, 28, NA, NA, NA, NA, NA, NA, NA, NA, NA, 25, NA, NA, NA, NA, NA, NA, 22, 27, NA, NA, NA, 28, NA, NA, NA, 28, 28, NA, 28, NA, 26, 20, 25, NA, NA, NA, 30, NA, NA, NA, 19, NA, NA, NA, NA, NA, NA, NA, NA)), row.names = c(NA, -132L), class = c("tbl_df", "tbl", "data.frame")) Yours sincerely Bharat RawlleyOn Tuesday, 19 January, 2021, 03:53:27 pm IST, Michael Dewey wrote: Unfortunately your data did not come through. Try using dput() and then pasting that into the body of your e-mail message. On 18/01/2021 17:26, bharat rawlley via R-help wrote: Hello, On running the Wilcoxon Rank Sum test in R and SPSS, I am getting the following discrepancies which I am un
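The tie problem discussed above is easy to reproduce with a few made-up scores (not the thread's data): with ties, wilcox.test() cannot compute exact p-values, and exact = FALSE requests the normal approximation explicitly:

```r
# Made-up scores with many ties, split into two groups
scores <- c(27, 26, 20, 28, 28, 28, 27, 27, 26, 28, 30, 28)
group  <- factor(c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1))

# Normal approximation with continuity correction; exact p-values are
# unavailable in the presence of ties
wilcox.test(scores ~ group, exact = FALSE, conf.int = TRUE)

# Cross-tabulation, as in the thread, shows where the ties are
xtabs(~ scores + group)
```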
Re: [R] Troubles installing Rcmdr on Mac
Dear Eberhard, On 2021-01-12 9:41 a.m., Dr Eberhard W Lisse wrote: John, maybe I misunderestimate the students :-)-O but there is not much sophistication required to follow simple instructions while thinking about what one is doing when doing so. At least that was what my generation of students did. If they can install XQuartz, they can install the command line tools. In my experience, mostly with social-science undergraduates and graduate students, the more steps an installation requires, the more likely students will encounter difficulties. Your students may well have a different level of computer sophistication than mine did, but I know that my experience isn't unique. And, it's not about RCmdr but about the source packages that you may want to have to install for which the command line tools are required. Right. But all of the packages on which the Rcmdr package depends have Mac binaries. As I confirmed yesterday, one can install the Rcmdr and its dependencies on macOS without building any packages from source. Finally, RStudio is so much easier and more powerful, that I wonder why one is bothering with this including XQuartz. Because I think you underestimate the obstacle that working at the command line presents to students who often are already struggling to learn basic statistical concepts. In my, and others', experience, it's easier for students at this level to work with a statistical GUI than to write commands. While working in RStudio or another IDE is undoubtedly more powerful, for them it certainly isn't easier. Your teaching experiences may be different. Best, John greetings, el On 12/01/2021 16:19, John Fox wrote: Dear Eberhard, On 2021-01-12 12:32 a.m., Dr Eberhard W Lisse wrote: John, what is wrong with installing Xcode’s command line tools (not Xcode itself)? 
Nothing, and I did miss the distinction, but it shouldn't be necessary, and the instructions for installing the Rcmdr are already more complicated on macOS than on other platforms because of the necessity to install XQuartz. Users should be able to install the Rcmdr package on macOS without having to install packages from source. Remember that Rcmdr users are typically students in basic statistics courses, many of whom have limited software sophistication. Unnecessarily complicating the installation is undesirable. Of course, if it's necessary to complicate the installation, one has to live with that. I'll be interested to learn whether my suggestions solve the problem. If not, I can add an instruction concerning the Xcode tools to the Rcmdr installation notes for macOS. Thanks for your help, John [...] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Troubles installing Rcmdr on Mac
Dear Stephane, On 2021-01-12 1:48 a.m., CHAMPELY STEPHANE wrote: Dear John, thank you for these responses, we will try this... today. We carefully read the installation notes, but it is sometimes difficult to really check what was done by the students because in France, our lessons are online lessons (covid-19...) Yes, the pandemic has made teaching very difficult. As a general matter, it's been my experience that almost all Rcmdr installation problems are on macOS. It's usually easy to help students get going in person, much less so remotely. Please let me know whether your students solve their problems, and if so, how, so that I can update the Rcmdr installation notes, if necessary. Also please keep the conversation on r-help so that others are able to follow it. Best, John All the best, Stéphane CHAMPELY Maître de conférences UFR STAPS , Laboratoire L-ViS, Université Lyon 1 De : John Fox Envoyé : mardi 12 janvier 2021 03:30 À : CHAMPELY STEPHANE Cc : r-help@r-project.org; Dr Eberhard W Lisse Objet : Re: [R] Troubles installing Rcmdr on Mac Dear Stephane, I've taken yet another look at this and have an additional suggestion for your students to try: install.packages("Rcmdr", type="mac.binary") That should avoid any attempt to install Rcmdr package dependencies from source. I hope this helps, John On 2021-01-11 3:53 p.m., John Fox wrote: Dear Stephane and Eberhard, As an addendum to my previous response, I uninstalled the Rcmdr package and all of its direct and indirect dependencies and then reinstalled the package -- on a macOS 11.1 system running R 4.0.3 with all other packages up-to-date. I then reinstalled the Rcmdr and dependencies via the command install.packages("Rcmdr"), and responded "no" when asked whether to install some packages from source (perhaps this is the explanation for the problem, if your students responded "yes" without having Xcode installed). Following these steps, everything (still) works fine. 
I therefore can't duplicate your students' problem, which makes it hard to suggest how to fix it, without having some additional details. Best, John On 2021-01-11 3:33 p.m., John Fox wrote: Dear Stephane and Eberhard, It should not be necessary to install Xcode (which includes otools) to install and use the Rcmdr package on macOS because it shouldn't be necessary to install the CRAN packages required from source. I'm currently running the Rcmdr on two macOS 11.1 systems, with all CRAN packages up-to-date, and don't have any problems. Stephane, have you and your students checked the Rcmdr installation notes (at <https://socialsciences.mcmaster.ca/jfox/Misc/Rcmdr/installation-notes.html>) and followed the instructions there? If you have, and still experience this problem, it would help to have some more information about what they did to install the Rcmdr and what happened. In the meantime, I'll try a fresh install of the Rcmdr and dependencies to see whether I encounter any difficulties. Best, John __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Troubles installing Rcmdr on Mac
Dear Eberhard, On 2021-01-12 12:32 a.m., Dr Eberhard W Lisse wrote: John, what is wrong with installing Xcode’s command lime tools (not Xcode itself)? Nothing, and I did miss the distinction, but it shouldn't be necessary, and the instructions for installing the Rcmdr are already more complicated on macOS than on other platforms because of the necessity to install XQuartz. Users should be able to install the Rcmdr package on macOS without having to install packages from source. Remember that Rcmdr users are typically students in basic statistics courses, many of whom have limited software sophistication. Unnecessarily complicating the installation is undesirable. Of course, if it's necessary to complicate the installation, one has to live with that. I'll be interested to learn whether my suggestions solve the problem. If not, I can add an instruction concerning the Xcode tools to the Rcmdr installation notes for macOS. Thanks for your help, John — Sent from Dr Lisse’s iPhone On 12 Jan 2021, 04:30 +0200, John Fox , wrote: Dear Stephane, I've taken yet another look at this and have an additional suggestion for your students to try: install.packages("Rcmdr", type="mac.binary") That should avoid any attempt to install Rcmdr package dependencies from source. I hope this helps, John On 2021-01-11 3:53 p.m., John Fox wrote: Dear Stephane and Eberhard, As an addendum to my previous response, I uninstalled the Rcmdr package and all of its direct and indirect dependencies and then reinstalled the package -- on a macOS 11.1 system running R 4.0.3 with all other packages up-to-date. I then reinstalled the Rcmdr and dependencies via the command install.packages("Rcmdr"), and responded "no" when asked whether to install some packages from source (perhaps this is the explanation for the problem, if your students responded "yes" without having Xcode installed). Following these steps, everything (still) works fine. 
I therefore can't duplicate your students' problem, which makes it hard to suggest how to fix it, without having some additional details. Best, John On 2021-01-11 3:33 p.m., John Fox wrote: Dear Stephane and Eberhard, It should not be necessary to install Xcode (which includes otools) to install and use the Rcmdr package on macOS because it shouldn't be necessary to install the CRAN packages required from source. I'm currently running the Rcmdr on two macOS 11.1 systems, with all CRAN packages up-to-date, and don't have any problems. Stephane, have you and your students checked the Rcmdr installation notes (at <https://socialsciences.mcmaster.ca/jfox/Misc/Rcmdr/installation-notes.html>) and followed the instructions there? If you have, and still experience this problem, it would help to have some more information about what they did to install the Rcmdr and what happened. In the meantime, I'll try a fresh install of the Rcmdr and dependencies to see whether I encounter any difficulties. Best, John __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Troubles installing Rcmdr on Mac
Dear Stephane, I've taken yet another look at this and have an additional suggestion for your students to try: install.packages("Rcmdr", type="mac.binary") That should avoid any attempt to install Rcmdr package dependencies from source. I hope this helps, John On 2021-01-11 3:53 p.m., John Fox wrote: Dear Stephane and Eberhard, As an addendum to my previous response, I uninstalled the Rcmdr package and all of its direct and indirect dependencies and then reinstalled the package -- on a macOS 11.1 system running R 4.0.3 with all other packages up-to-date. I then reinstalled the Rcmdr and dependencies via the command install.packages("Rcmdr"), and responded "no" when asked whether to install some packages from source (perhaps this is the explanation for the problem, if your students responded "yes" without having Xcode installed). Following these steps, everything (still) works fine. I therefore can't duplicate your students' problem, which makes it hard to suggest how to fix it, without having some additional details. Best, John On 2021-01-11 3:33 p.m., John Fox wrote: Dear Stephane and Eberhard, It should not be necessary to install Xcode (which includes otools) to install and use the Rcmdr package on macOS because it shouldn't be necessary to install the CRAN packages required from source. I'm currently running the Rcmdr on two macOS 11.1 systems, with all CRAN packages up-to-date, and don't have any problems. Stephane, have you and your students checked the Rcmdr installation notes (at <https://socialsciences.mcmaster.ca/jfox/Misc/Rcmdr/installation-notes.html>) and followed the instructions there? If you have, and still experience this problem, it would help to have some more information about what they did to install the Rcmdr and what happened. In the meantime, I'll try a fresh install of the Rcmdr and dependencies to see whether I encounter any difficulties. 
Best, John __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
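For reference, a hedged sketch of the binary-only install described above (the install line is macOS-specific and is left commented out; nothing about the check is specific to Rcmdr):

```r
# Force a binary install so that no source compilation (and hence no Xcode
# command line tools) is needed; "mac.binary" is the macOS binary type:
# install.packages("Rcmdr", type = "mac.binary")

# The session's default install type: typically "both" or a binary type on
# macOS/Windows binaries, "source" on Linux
getOption("pkgType")
```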
Re: [R] Troubles installing Rcmdr on Mac
Dear Stephane and Eberhard, As an addendum to my previous response, I uninstalled the Rcmdr package and all of its direct and indirect dependencies and then reinstalled the package -- on a macOS 11.1 system running R 4.0.3 with all other packages up-to-date. I then reinstalled the Rcmdr and dependencies via the command install.packages("Rcmdr"), and responded "no" when asked whether to install some packages from source (perhaps this is the explanation for the problem, if your students responded "yes" without having Xcode installed). Following these steps, everything (still) works fine. I therefore can't duplicate your students' problem, which makes it hard to suggest how to fix it, without having some additional details. Best, John On 2021-01-11 3:33 p.m., John Fox wrote: Dear Stephane and Eberhard, It should not be necessary to install Xcode (which includes otools) to install and use the Rcmdr package on macOS because it shouldn't be necessary to install the CRAN packages required from source. I'm currently running the Rcmdr on two macOS 11.1 systems, with all CRAN packages up-to-date, and don't have any problems. Stephane, have you and your students checked the Rcmdr installation notes (at <https://socialsciences.mcmaster.ca/jfox/Misc/Rcmdr/installation-notes.html>) and followed the instructions there? If you have, and still experience this problem, it would help to have some more information about what they did to install the Rcmdr and what happened. In the meantime, I'll try a fresh install of the Rcmdr and dependencies to see whether I encounter any difficulties. Best, John __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Troubles installing Rcmdr on Mac
Dear Stephane and Eberhard, It should not be necessary to install Xcode (which includes otool) to install and use the Rcmdr package on macOS, because it shouldn't be necessary to install the required CRAN packages from source. I'm currently running the Rcmdr on two macOS 11.1 systems, with all CRAN packages up to date, and don't have any problems. Stephane, have you and your students checked the Rcmdr installation notes (at <https://socialsciences.mcmaster.ca/jfox/Misc/Rcmdr/installation-notes.html>) and followed the instructions there? If you have and still experience this problem, it would help to have more information about what they did to install the Rcmdr and what happened. In the meantime, I'll try a fresh install of the Rcmdr and dependencies to see whether I encounter any difficulties. Best, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2021-01-11 1:55 p.m., Dr Eberhard W Lisse wrote: Use RStudio. But it can be that the command-line tools are missing, which you (may) need to compile packages (from source). Ask one of them to open a terminal window and type the command 'make --version' (without the quotes). If that results in an error, they need to enter 'sudo xcode-select --install' and then their password when asked. If that fixes the issue, have all of them do that. el -- Sent from Dr Lisse's iPhone On 11 Jan 2021, 20:39 +0200, CHAMPELY STEPHANE, wrote: Dear colleagues, I have been trying for five days to help my (French) students install the Rcmdr for Mac, and ALL of them have the same problem (I use Windows, so I am not very skilled at this task). When they load the Rcmdr, a message says that some supplementary tools (needed in order to use the command "otool") are missing, and trying to download them leads to a message indicating that they are not available at the present moment (since Thursday last week...). So the menus of the Rcmdr are "white". Any idea where this technical problem comes from?
Thank you for any help!
Re: [R] [effects] Wrong xlevels in effects plot for mixed effects model when multiline = TRUE
Dear Gerrit, The bug you reported should now be fixed in the development version 4.2-1 of the effects package, which you can currently install from R-Forge via install.packages("effects", repos="http://R-Forge.R-project.org"). Eventually, the updated version of the effects package will be submitted to CRAN. Thank you again for the bug report, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2020-11-09 4:51 p.m., Gerrit Eichner wrote: Dear John, thank you for your prompt reply and hints. The problem is that our lmer model is much more complicated and has several interaction terms: Mass ~ Sex + I(YoE - 1996) + I(PAI/0.1 - 16) + I(gProt/10 - 6.2) + I(Age/10 - 7.2) + I((Age/10 - 7.2)^2) + Diuretics + Sex:I(PAI/0.1 - 16) + Sex:I(gProt/10 - 6.2) + Sex:I(Age/10 - 7.2) + Sex:I((Age/10 - 7.2)^2) + I(YoE - 1996):I(Age/10 - 7.2) + I(PAI/0.1 - 16):I(Age/10 - 7.2) + I(gProt/10 - 6.2):I(Age/10 - 7.2) + (I(Age/10 - 7.2) + I((Age/10 - 7.2)^2) | ID) so that allEffects is quite efficient, and since I want to place several interaction terms with Age in one figure with Age on the horizontal axis, the argument x.var = "Age" in plot would be very helpful. :-) Further hints using the above complex model: The following works well: eff <- Effect(c("gProt", "Age"), m, xlevels = list(gProt = 1:6 * 30, Age = 60:100)) plot(eff, lines=list(multiline=TRUE), x.var = "Age") But this fails (note that Age is missing in xlevels): eff <- Effect(c("gProt", "Age"), m, xlevels = list(gProt = 1:6 * 30)) plot(eff, lines=list(multiline=TRUE), x.var = "Age") And that just led me to a solution also for allEffects: Specifying Age in xlevels for allEffects (although it seems unnecessary when x.var = "Age" is used in plot) produces the correct graphical output! :-) Thank you very much for your support and the brilliant effects package in general! :-) Best regards -- Gerrit - Dr.
Gerrit Eichner Mathematical Institute, Room 212 gerrit.eich...@math.uni-giessen.de Justus-Liebig-University Giessen Tel: +49-(0)641-99-32104 Arndtstr. 2, 35392 Giessen, Germany Fax: +49-(0)641-99-32109 http://www.uni-giessen.de/eichner ----- On 09.11.2020 at 19:51, John Fox wrote: Dear Gerrit, This looks like a bug in plot.eff(), which I haven't yet tracked down, but the following should give you what you want: eff <- Effect(c("gProt", "Age"), m, xlevels = list(gProt = 1:6 * 30, Age=60:100)) plot(eff, lines=list(multiline=TRUE)) or eff <- predictorEffect("Age", m, xlevels = list(gProt = 1:6 * 30)) plot(eff, lines=list(multiline=TRUE)) A couple of comments on your code, unrelated to the bug in plot.eff(): You don't need allEffects() because there's only one high-order fixed effect in the model, I(gProt/10 - 6.2):I(Age/10 - 7.2) (i.e., the interaction of gProt with Age). x.var isn't intended as an argument for plot() with allEffects() because there generally isn't a common horizontal axis for all of the high-order effect plots. Finally, thank you for the bug report. Barring unforeseen difficulties, we'll fix the bug in due course. I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2020-11-09 8:06 a.m., Gerrit Eichner wrote: Dear list members, I observe a strange/wrong graphical output when I set the xlevels in (e.g.) allEffects for an lmer model and plot the effects with multiline = TRUE. I have compiled a reprex for which you need the lmer model and the environment in which the model was fitted. They are contained in the zip file at https://jlubox.uni-giessen.de/dl/fiSzTCc3bW8z2npZvPpqG1xr/m-and-G1.zip After unpacking the following should work: m <- readRDS("m.rds") # The lmer-model. G1 <- readRDS("G1.rds") # Environment in which the model # was fitted; needed by allEffects. summary(m) # Just to see the model.
library(effects) aE <- allEffects(m, xlevels = list(gProt = 1:6 * 30)) # Non-default values for xlevels. plot(aE) # Fine. plot(aE, x.var = "Age") # Fine. plot(aE, lines = list(multiline = TRUE)) # Fine. plot(aE, lines = list(multiline = TRUE), x.var = "Age") # Nonsense. Anybody any idea about the reason, my mistake, or a workaround? Thx for any hint! Regards -- Gerrit PS: > sessionInfo() R version 4.0.2 (2020-06-22) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 18363)
Re: [R] [effects] Wrong xlevels in effects plot for mixed effects model when multiline = TRUE
Dear Gerrit, This looks like a bug in plot.eff(), which I haven't yet tracked down, but the following should give you what you want: eff <- Effect(c("gProt", "Age"), m, xlevels = list(gProt = 1:6 * 30, Age=60:100)) plot(eff, lines=list(multiline=TRUE)) or eff <- predictorEffect("Age", m, xlevels = list(gProt = 1:6 * 30)) plot(eff, lines=list(multiline=TRUE)) A couple of comments on your code, unrelated to the bug in plot.eff(): You don't need allEffects() because there's only one high-order fixed effect in the model, I(gProt/10 - 6.2):I(Age/10 - 7.2) (i.e., the interaction of gProt with Age). x.var isn't intended as an argument for plot() with allEffects() because there generally isn't a common horizontal axis for all of the high-order effect plots. Finally, thank you for the bug report. Barring unforeseen difficulties, we'll fix the bug in due course. I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2020-11-09 8:06 a.m., Gerrit Eichner wrote: Dear list members, I observe a strange/wrong graphical output when I set the xlevels in (e.g.) allEffects for an lmer model and plot the effects with multiline = TRUE. I have compiled a reprex for which you need the lmer model and the environment in which the model was fitted. They are contained in the zip file at https://jlubox.uni-giessen.de/dl/fiSzTCc3bW8z2npZvPpqG1xr/m-and-G1.zip After unpacking the following should work: m <- readRDS("m.rds") # The lmer-model. G1 <- readRDS("G1.rds") # Environment in which the model # was fitted; needed by allEffects. summary(m) # Just to see the model. library(effects) aE <- allEffects(m, xlevels = list(gProt = 1:6 * 30)) # Non-default values for xlevels. plot(aE) # Fine. plot(aE, x.var = "Age") # Fine. plot(aE, lines = list(multiline = TRUE)) # Fine. plot(aE, lines = list(multiline = TRUE), x.var = "Age") # Nonsense. Anybody any idea about the reason, my mistake, or a workaround?
Thx for any hint! Regards -- Gerrit PS: > sessionInfo() R version 4.0.2 (2020-06-22) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 18363) Matrix products: default locale: [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C [5] LC_TIME=German_Germany.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] effects_4.2-0 carData_3.0-4 loaded via a namespace (and not attached): [1] Rcpp_1.0.5 lattice_0.20-41 MASS_7.3-53 grid_4.0.2 DBI_1.1.0 [6] nlme_3.1-149 survey_4.0 estimability_1.3 minqa_1.2.4 nloptr_1.2.2.2 [11] Matrix_1.2-18 boot_1.3-25 splines_4.0.2 statmod_1.4.34 lme4_1.1-23 [16] tools_4.0.2 survival_3.2-3 yaml_2.2.1 compiler_4.0.2 colorspace_1.4-1 [21] mitools_2.4 insight_0.9.5 nnet_7.3-14 - Dr. Gerrit Eichner Mathematical Institute, Room 212 gerrit.eich...@math.uni-giessen.de Justus-Liebig-University Giessen Tel: +49-(0)641-99-32104 Arndtstr. 2, 35392 Giessen, Germany http://www.uni-giessen.de/eichner
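The workaround that emerged in this thread can be collected into a short recipe. This is a sketch only: it assumes Gerrit's fitted lmer model m, with predictors gProt and Age, which is not reproduced here.

```r
library(effects)

# For a multiline display with Age on the horizontal axis, include the
# x.var variable explicitly in xlevels, either in Effect() ...
eff <- Effect(c("gProt", "Age"), m,
              xlevels = list(gProt = 1:6 * 30, Age = 60:100))
plot(eff, lines = list(multiline = TRUE), x.var = "Age")

# ... or in allEffects(), even though it looks redundant there when
# x.var is also given to plot():
aE <- allEffects(m, xlevels = list(gProt = 1:6 * 30, Age = 60:100))
plot(aE, lines = list(multiline = TRUE), x.var = "Age")
```

Once the bug fix in effects 4.2-1 mentioned above is installed, the second xlevels entry should no longer be necessary.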
Re: [R] how to get a numeric vector?
Dear vod vos, On 2020-10-04 6:47 p.m., vod vos via R-help wrote: Hi, a <- c(1, 4) b <- c(5, 8) a:b [1] 1 2 3 4 5 Warning messages: 1: In a:b : numerical expression has 2 elements: only the first used 2: In a:b : numerical expression has 2 elements: only the first used how to get: c(1:5, 4:8) The simplest way is c(1:5, 4:8) but I don't suppose that's what you really want. Perhaps the following is what you have in mind: > unlist(mapply(':', c(1, 4), c(5, 8), SIMPLIFY=FALSE)) [1] 1 2 3 4 5 4 5 6 7 8 In your case, but not more generally, > as.vector(mapply(':', c(1, 4), c(5, 8))) [1] 1 2 3 4 5 4 5 6 7 8 also works. I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ Thanks.
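John's mapply() idiom generalizes to a small helper. A sketch (the function name seq_ranges is invented here, not part of base R):

```r
# Concatenate several integer sequences, given vectors of endpoints.
seq_ranges <- function(from, to) unlist(Map(`:`, from, to))

seq_ranges(c(1, 4), c(5, 8))
#> [1] 1 2 3 4 5 4 5 6 7 8
```

Map() is just mapply(..., SIMPLIFY = FALSE), so this is the same computation in a reusable form, and it works for any number of ranges.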
Re: [R] formula wrangling
Dear Roger, This is an interesting puzzle and I started to look at it when your second message arrived. I can simplify your code slightly in two places, here: if (exists("fqssnames")) { mff <- m ffqss <- paste(fqssnames, collapse = "+") mff$formula <- as.formula(paste(deparse(Terms), "+", ffqss)) } and here: if (length(qssterms) > 0) { X <- do.call(cbind, c(list(X), lapply(tmpc$vars, function(u) eval(parse(text = u), mff)))) } and the following line is extraneous: ef <- environment(formula) That doesn't amount to much, and I haven't tested my substitute code beyond your example. Best, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2020-09-21 9:40 a.m., Koenker, Roger W wrote: Here is a revised snippet that seems to work the way that was intended. Apologies to anyone who wasted time looking at the original post. Of course my interest in simpler or more efficient solutions remains unabated. if (exists("fqssnames")) { mff <- m mff$formula <- Terms ffqss <- paste(fqssnames, collapse = "+") mff$formula <- as.formula(paste(deparse(mff$formula), "+", ffqss)) } m$formula <- Terms m <- eval(m, parent.frame()) mff <- eval(mff, parent.frame()) Y <- model.extract(m, "response") X <- model.matrix(Terms, m) ef <- environment(formula) qss <- function(x, lambda) (x^lambda - 1)/lambda if (length(qssterms) > 0) { xss <- lapply(tmpc$vars, function(u) eval(parse(text = u), mff)) for(i in 1:length(xss)){ X <- cbind(X, xss[[i]]) # Here is the problem } } On Sep 21, 2020, at 9:52 AM, Koenker, Roger W wrote: I need some help with a formula processing problem that arose from a seemingly innocuous request that I add a "subset" argument to the additive modeling function "rqss" in my quantreg package. I've tried to boil the relevant code down to something simpler as illustrated below.
The formulae in question involve terms called "qss" that construct sparse matrix objects, but I've replaced all that with a much simpler BoxCox construction that I hope illustrates the basic difficulty. What is supposed to happen is that xss objects are evaluated and cbind'd to the design matrix, subject to the same subset restriction as the rest of the model frame. However, this doesn't happen; instead, the xss vectors are evaluated on the full sample, and the cbind operation generates a warning which probably should be an error. I've inserted a browser() to make it easy to verify that the length of xss[[1]] doesn't match dim(X). Any suggestions would be most welcome, including other simplifications of the code. Note that the function untangle.specials() is adapted, or perhaps I should say adopted, from the survival package, so you would need the quantreg package to run the attached code. Thanks, Roger fit <- function(formula, subset, data, ...){ call <- match.call() m <- match.call(expand.dots = FALSE) tmp <- c("", "formula", "subset", "data") m <- m[match(tmp, names(m), nomatch = 0)] m[[1]] <- as.name("model.frame") Terms <- if(missing(data)) terms(formula,special = "qss") else terms(formula, special = "qss", data = data) qssterms <- attr(Terms, "specials")$qss if (length(qssterms)) { tmpc <- untangle.specials(Terms, "qss") dropx <- tmpc$terms if (length(dropx)) Terms <- Terms[-dropx] attr(Terms, "specials") <- tmpc$vars fnames <- function(x) { fy <- all.names(x[[2]]) if (fy[1] == "cbind") fy <- fy[-1] fy } fqssnames <- unlist(lapply(parse(text = tmpc$vars), fnames)) qssnames <- unlist(lapply(parse(text = tmpc$vars), function(x) deparse(x[[2]]))) } if (exists("fqssnames")) { ffqss <- paste(fqssnames, collapse = "+") ff <- as.formula(paste(deparse(formula), "+", ffqss)) } m$formula <- Terms m <- eval(m, parent.frame()) Y <- model.extract(m, "response") X <- model.matrix(Terms, m) ef <- environment(formula) qss <- function(x, lambda) (x^lambda - 1)/lambda if
(length(qssterms) > 0) { xss <- lapply(tmpc$vars, function(u) eval(parse(text = u), m, enclos = ef)) for(i in 1:length(xss)){ X <- cbind(X, xss[[i]]) # Here is the problem } } browser() z <- lm.fit(X,Y) # The dreaded least squares fit z
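The root cause Roger identified -- the constructed terms being evaluated in the full data rather than in the subsetted model frame -- can be demonstrated in a few lines. This is a toy illustration with invented data, not the rqss code itself:

```r
dat <- data.frame(x = 1:10, y = rnorm(10))

# model.frame() honors the subset argument ...
mf <- model.frame(y ~ x, data = dat, subset = x > 3)

# ... so evaluating a constructed term in the model frame yields the
# subsetted length, while evaluating it in the full data does not:
length(eval(parse(text = "log(x)"), mf))   # 7
length(eval(parse(text = "log(x)"), dat))  # 10
```

This is exactly why the corrected snippet earlier in the thread evaluates tmpc$vars in mff, the model frame built from the augmented formula, rather than in the unrestricted data.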
Re: [R] linearHypothesis
Dear Johan, It's generally a good idea to keep the conversation on r-help to allow list members to follow it, and so I'm cc'ing this response to the list. I hope that it's clear that car::linearHypothesis() computes the test as a Wald test of a linear hypothesis and not as a likelihood-ratio test by model comparison. As your example illustrates, however, the two tests are the same for a linear model, but this is not true more generally. As I mentioned, you can find the details in many sources, including in Section 5.3.5 of Fox and Weisberg, An R Companion to Applied Regression, 3rd Edition, the book with which the car package is associated. Best, John On 2020-09-17 4:03 p.m., Johan Lassen wrote: Thank you John - highly appreciated! Yes, you are right, the less complex model may be seen as a restricted model of the starting model. Although the set of variables in the less complex model is not directly a subset of the variables of the starting model. What confused me at first was that I think of a subset model as a model having a direct subset of the set of variables of the starting model. Even though this is not the case in the example, the test still is on a restricted model of the starting model. Thanks, Johan Den tor. 17. sep. 2020 kl. 15.55 skrev John Fox <mailto:j...@mcmaster.ca>>: Dear Johan, On 2020-09-17 9:07 a.m., Johan Lassen wrote: > Dear R-users, > > I am using the R-function "linearHypothesis" to test if the sum of all > parameters, but the intercept, in a multiple linear regression is different > from zero. > I wonder if it is statistically valid to use the linearHypothesis-function > for this? Yes, assuming of course that the hypothesis makes sense. > Below is a reproducible example in R. 
A multiple regression: y = > beta0*t0+beta1*t1+beta2*t2+beta3*t3+beta4*t4 > > It seems to me that the linearHypothesis function does the calculation as > an F-test on the extra residuals when going from the starting model to a > 'subset' model, although all variables in the 'subset' model differ from > the variables in the starting model. > I normally think of a subset model as a model built on the same input data > as the starting model but one variable. > > Hence, is this a valid calculation? First, linearHypothesis() doesn't literally fit alternative models, but rather tests the linear hypothesis directly from the coefficient estimates and their covariance matrix. The test is standard -- look at the references in ?linearHypothesis or most texts on linear models. Second, formulating the hypothesis using alternative models is also legitimate, since the second model is a restricted version of the first. > > Thanks in advance,Johan > > # R-code: > y <- > c(101133190,96663050,106866486,97678429,83212348,75719714,77861937,74018478,82181104,68667176,64599495,62414401,63534709,58571865,65222727,60139788, > 63355011,57790610,55214971,55535484,55759192,49450719,48834699,51383864,51250871,50629835,52154608,54636478,54942637) > > data <- > data.frame(y,"t0"=1,"t1"=1990:2018,"t2"=c(rep(0,12),1:17),"t3"=c(rep(0,17),1:12),"t4"=c(rep(0,23),1:6)) > > model <- lm(y~t0+t1+t2+t3+t4+0,data=data) You need not supply the constant regressor t0 explicitly and suppress the intercept -- you'd get the same test from linearHypothesis() for lm(y~t1+t2+t3+t4,data=data). > > linearHypothesis(model,"t1+t2+t3+t4=0",test=c("F")) test = "F" is the default. 
> > # Reproduce the result from linearHypothesis: > # beta1+beta2+beta3+beta4=0 -> beta4=-(beta1+beta2+beta3) -> > # y=beta0+beta1*t1+beta2*t2+beta3*t3-(beta1+beta2+beta3)*t4 > # y = beta0'+beta1'*(t1-t4)+beta2'*(t2-t4)+beta3'*(t3-t4) > > data$t1 <- data$t1-data$t4 > data$t2 <- data$t2-data$t4 > data$t3 <- data$t3-data$t4 > > model_reduced <- lm(y~t0+t1+t2+t3+0,data=data) > > anova(model_reduced,model) Yes, this is equivalent to the test performed by linearHypothesis() using the coefficients and their covariances from the original model. I hope this helps, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ > -- Johan Lassen "In the cities people live in time - in the mountains people live in space" (Buddhist monk). -- John Fox, Professor Emeritus McMaster University
Re: [R] linearHypothesis
Dear Johan, On 2020-09-17 9:07 a.m., Johan Lassen wrote: Dear R-users, I am using the R-function "linearHypothesis" to test if the sum of all parameters, but the intercept, in a multiple linear regression is different from zero. I wonder if it is statistically valid to use the linearHypothesis-function for this? Yes, assuming of course that the hypothesis makes sense. Below is a reproducible example in R. A multiple regression: y = beta0*t0+beta1*t1+beta2*t2+beta3*t3+beta4*t4 It seems to me that the linearHypothesis function does the calculation as an F-test on the extra residuals when going from the starting model to a 'subset' model, although all variables in the 'subset' model differ from the variables in the starting model. I normally think of a subset model as a model built on the same input data as the starting model but one variable. Hence, is this a valid calculation? First, linearHypothesis() doesn't literally fit alternative models, but rather tests the linear hypothesis directly from the coefficient estimates and their covariance matrix. The test is standard -- look at the references in ?linearHypothesis or most texts on linear models. Second, formulating the hypothesis using alternative models is also legitimate, since the second model is a restricted version of the first. Thanks in advance,Johan # R-code: y <- c(101133190,96663050,106866486,97678429,83212348,75719714,77861937,74018478,82181104,68667176,64599495,62414401,63534709,58571865,65222727,60139788, 63355011,57790610,55214971,55535484,55759192,49450719,48834699,51383864,51250871,50629835,52154608,54636478,54942637) data <- data.frame(y,"t0"=1,"t1"=1990:2018,"t2"=c(rep(0,12),1:17),"t3"=c(rep(0,17),1:12),"t4"=c(rep(0,23),1:6)) model <- lm(y~t0+t1+t2+t3+t4+0,data=data) You need not supply the constant regressor t0 explicitly and suppress the intercept -- you'd get the same test from linearHypothesis() for lm(y~t1+t2+t3+t4,data=data). 
linearHypothesis(model,"t1+t2+t3+t4=0",test=c("F")) test = "F" is the default. # Reproduce the result from linearHypothesis: # beta1+beta2+beta3+beta4=0 -> beta4=-(beta1+beta2+beta3) -> # y=beta0+beta1*t1+beta2*t2+beta3*t3-(beta1+beta2+beta3)*t4 # y = beta0'+beta1'*(t1-t4)+beta2'*(t2-t4)+beta3'*(t3-t4) data$t1 <- data$t1-data$t4 data$t2 <- data$t2-data$t4 data$t3 <- data$t3-data$t4 model_reduced <- lm(y~t0+t1+t2+t3+0,data=data) anova(model_reduced,model) Yes, this is equivalent to the test performed by linearHypothesis() using the coefficients and their covariances from the original model. I hope this helps, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/
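The equivalence John describes can also be checked numerically: the Wald F statistic computed directly from the coefficient estimates b and their covariance matrix V, F = (Lb)'(LVL')^{-1}(Lb)/q for hypothesis matrix L with q rows, matches the model-comparison F from anova(). A sketch, assuming model and data as defined in the quoted code above (t0 is the explicit constant, with the intercept suppressed, so the coefficients are t0, t1, t2, t3, t4):

```r
b <- coef(model)
V <- vcov(model)
L <- rbind(c(0, 1, 1, 1, 1))   # hypothesis matrix for t1 + t2 + t3 + t4 = 0
q <- nrow(L)                   # number of restrictions

Fstat <- drop(t(L %*% b) %*% solve(L %*% V %*% t(L)) %*% (L %*% b)) / q
pval  <- pf(Fstat, q, df.residual(model), lower.tail = FALSE)
```

Fstat and pval should agree with both linearHypothesis(model, "t1+t2+t3+t4=0") and anova(model_reduced, model).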
[R] [R-pkgs] new ivreg package for 2SLS regression with diagnostics
Dear list members, Christian Kleiber, Achim Zeileis, and I would like to announce a new CRAN package, ivreg, which provides a comprehensive implementation of instrumental-variables estimation using two-stage least-squares (2SLS) regression. The standard regression functionality (parameter estimation, inference, robust covariances, predictions, etc.) in the package is derived from, and supersedes, the ivreg() function in the AER package. Additionally, various regression diagnostics are supported, including hat values; deletion diagnostics, such as studentized residuals and Cook's distances; graphical diagnostics, such as component-plus-residual plots and added-variable plots; and effect plots with partial residuals. In order to provide these features, the ivreg package integrates seamlessly with other packages through suitable S3 methods, specifically for generic functions in the base-R stats package and in the car, effects, lmtest, and sandwich packages, among others. The ivreg package is accompanied by two online vignettes: a brief general introduction to the package, and an introduction to the regression diagnostics and graphics that it provides. For more information, see the ivreg CRAN webpage at <https://cran.r-project.org/package=ivreg> and the ivreg pkgdown webpage at <https://john-d-fox.github.io/ivreg/>. Comments, suggestions, and bug reports would be appreciated. John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ ___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages
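For readers who want a quick start, here is a minimal sketch of the package's use. The two-part formula (regressors before the "|", instruments after it) follows the AER convention the announcement mentions; the use of the Kmenta demand-supply data and the diagnostic method names are assumptions based on the description above, not verbatim from the announcement:

```r
library(ivreg)
data("Kmenta", package = "ivreg")   # classic demand-supply example data

m <- ivreg(Q ~ P + D | D + F + A, data = Kmenta)
summary(m)       # 2SLS estimates and the usual inference
hatvalues(m)     # hat values
rstudent(m)      # studentized residuals (deletion diagnostics)
```

The S3 integration described above means that generic tools such as sandwich::vcovHC() and car::avPlots() should also work on the fitted m.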
Re: [R] How to obtain individual log-likelihood value from glm?
Dear John, If you look at the code for logitreg() in the MASS text, you'll see that the casewise components of the log-likelihood are multiplied by the corresponding weights. As far as I can see, this only makes sense if the weights are binomial trials. Otherwise, while the coefficients themselves will be the same as obtained for proportionally similar integer weights (e.g., using your weights rather than weights/10), quantities such as the maximized log-likelihood, deviance, and coefficient standard errors will be uninterpretable. logitreg() is simply another way to compute the MLE, using a general-purpose optimizer rather than iteratively weighted least-squares, which is what glm() uses. That the two functions provide the same answer within rounding error is unsurprising -- they're solving the same problem. A difference between the two functions is that glm() issues a warning about non-integer weights, while logitreg() doesn't. As I understand it, the motivation for writing logitreg() is to provide a function that could easily be modified, e.g., to impose parameter constraints on the solution. I think that this discussion has gotten unproductive. If you feel that proceeding with noninteger weights makes sense, for a reason that I don't understand, then you should go ahead. Best, John On 2020-08-29 1:23 p.m., John Smith wrote: In the book Modern Applied Statistics with S, 4th edition, 2002, by Venables and Ripley, there is a function logitreg on page 445, which does provide the weighted logistic regression I asked about, judging by the loss function. And interestingly enough, logitreg provides the same coefficients as glm in the example I provided earlier, even with weights < 1. Also for residual deviance, logitreg yields the same number as glm. Unless I misunderstood something, I am convinced that glm is a valid tool for weighted logistic regression despite the description on weights and the somehow questionable logLik value in the case of non-integer weights < 1.
Perhaps this is a bold claim: the description of weights can be modified and logLik can be updated as well. The stackexchange inquiry I provided is what I find interesting, not the link in that post. Sorry for the confusion. On Sat, Aug 29, 2020 at 10:18 AM John Smith <mailto:jsw...@gmail.com>> wrote: Thanks for very insightful thoughts. What I am trying to achieve with the weights is actually not new, something like https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances. I thought my inquiry was not too strange, and I could utilize some existing code. It is just an optimization problem at the end of the day, or not? Thanks On Sat, Aug 29, 2020 at 9:02 AM John Fox mailto:j...@mcmaster.ca>> wrote: Dear John, On 2020-08-29 1:30 a.m., John Smith wrote: > Thanks Prof. Fox. > > I am curious: what is the model estimated below? Nonsense, as Peter explained in a subsequent response to your prior posting. > > I guess my inquiry seems more complicated than I thought: with y being 0/1, how to fit weighted logistic regression with weights <1, in the sense of weighted least squares? Thanks What sense would that make? WLS is meant to account for non-constant error variance in a linear model, but in a binomial GLM, the variance is purely a function of the mean. If you had binomial (rather than binary 0/1) observations (i.e., binomial trials exceeding 1), then you could account for overdispersion, e.g., by introducing a dispersion parameter via the quasibinomial family, but that isn't equivalent to variance weights in a LM, rather to the error-variance parameter in a LM. I guess the question is what are you trying to achieve with the weights? Best, John > >> On Aug 28, 2020, at 10:51 PM, John Fox mailto:j...@mcmaster.ca>> wrote: >> >> Dear John >> >> I think that you misunderstand the use of the weights argument to glm() for a binomial GLM.
From ?glm: "For a binomial GLM prior weights are used to give the number of trials when the response is the proportion of successes." That is, in this case y should be the observed proportion of successes (i.e., between 0 and 1) and the weights are integers giving the number of trials for each binomial observation. >> >> I hope this helps, >> John >> >> John Fox, Professor Emeritus >> McMaster University >> Hamilton, Ontario, Canada >> web: https://socialsciences.mcmaster.ca/jfox/
Re: [R] How to obtain individual log-likelihood value from glm?
Dear John, On 2020-08-29 11:18 a.m., John Smith wrote: Thanks for very insightful thoughts. What I am trying to achieve with the weights is actually not new, something like https://stats.stackexchange.com/questions/44776/logistic-regression-with-weighted-instances. I thought my inquiry was not too strange, and I could utilize some existing code. It is just an optimization problem at the end of the day, or not? Thanks So the object is to fit a regularized (i.e., penalized) logistic regression rather than to fit by ML. glm() won't do that. I took a quick look at the stackexchange link that you provided and the document referenced in that link. The penalty proposed in the document is just a multiple of the sum of squared regression coefficients, what is usually called an L2 penalty in the machine-learning literature. There are existing implementations of regularized logistic regression in R -- see the machine learning CRAN taskview <https://cran.r-project.org/web/views/MachineLearning.html>. I believe that the penalized package will fit a regularized logistic regression with an L2 penalty. As well, unless my quick reading was inaccurate, I think that you, and perhaps the stackexchange poster, might have been confused by the terminology used in the document: What's referred to as "weights" in the document is what statisticians more typically call "regression coefficients," and the "bias weight" is the "intercept" or "regression constant." Perhaps I'm missing some connection -- I'm not the best person to ask about machine learning. Best, John On Sat, Aug 29, 2020 at 9:02 AM John Fox wrote: Dear John, On 2020-08-29 1:30 a.m., John Smith wrote: Thanks Prof. Fox. I am curious: what is the model estimated below? Nonsense, as Peter explained in a subsequent response to your prior posting. I guess my inquiry seems more complicated than I thought: with y being 0/1, how to fit weighted logistic regression with weights <1, in the sense of weighted least squares?
Thanks What sense would that make? WLS is meant to account for non-constant error variance in a linear model, but in a binomial GLM, the variance is purely a function of the mean. If you had binomial (rather than binary 0/1) observations (i.e., binomial trials exceeding 1), then you could account for overdispersion, e.g., by introducing a dispersion parameter via the quasibinomial family, but that isn't equivalent to variance weights in a LM -- rather, it is analogous to the error-variance parameter in a LM. I guess the question is what are you trying to achieve with the weights? Best, John On Aug 28, 2020, at 10:51 PM, John Fox wrote: Dear John I think that you misunderstand the use of the weights argument to glm() for a binomial GLM. From ?glm: "For a binomial GLM prior weights are used to give the number of trials when the response is the proportion of successes." That is, in this case y should be the observed proportion of successes (i.e., between 0 and 1) and the weights are integers giving the number of trials for each binomial observation. I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2020-08-28 9:28 p.m., John Smith wrote: If the weights < 1, then we have different values! See an example below. How should I interpret logLik value then? set.seed(135) y <- c(rep(0, 50), rep(1, 50)) x <- rnorm(100) data <- data.frame(cbind(x, y)) weights <- c(rep(1, 50), rep(2, 50)) fit <- glm(y~x, data, family=binomial(), weights/10) res.dev <- residuals(fit, type="deviance") res2 <- -0.5*res.dev^2 cat("loglikelihood value", logLik(fit), sum(res2), "\n") On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard wrote: If you don't worry too much about an additive constant, then half the negative squared deviance residuals should do. (Not quite sure how weights factor in. Looks like they are accounted for.) 
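The use of weights that ?glm describes can be illustrated with a small sketch (not from the thread; the data are made up): the response is the observed proportion of successes and the weights argument gives the binomial trial counts.

```r
# Sketch (made-up data): the intended use of 'weights' in a binomial glm() --
# the response is the proportion of successes, and 'weights' gives the
# number of trials for each binomial observation, per ?glm.
set.seed(1)
n.trials <- sample(5:20, 10, replace = TRUE)          # trials per observation
x <- rnorm(10)
y.prop <- rbinom(10, n.trials, plogis(x)) / n.trials  # proportions in [0, 1]
fit <- glm(y.prop ~ x, family = binomial, weights = n.trials)
coef(fit)  # intercept and slope estimates
```

With genuine trial counts like these, the weights are integers >= 1, which is why fractional weights like the poster's weights/10 fall outside what the binomial family is designed for.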
-pd On 25 Aug 2020, at 17:33 , John Smith wrote: Dear R-help, The function logLik can be used to obtain the maximum log-likelihood value from a glm object. This is an aggregated value, a summation of individual log-likelihood values. How do I obtain individual values? In the following example, I would expect 9 numbers since the response has length 9. I could write a function to compute the values, but there are lots of family members in glm, and I am trying not to reinvent wheels. Thanks! counts <- c(18,17,15,20,10,20,25,13,12) outcome <- gl(3,1,9) treatment <- gl(3,3) data.frame(treatment, outcome, counts) # showing data glm.D93 <- glm(counts ~ outcome + treatment, family = poisson()) (ll <- logLik(glm.D93)) [[alternative HTML version deleted]]
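For the Poisson example that started the thread, the per-observation log-likelihoods can also be computed directly from the density function; a sketch using dpois() (not mentioned in the thread), which also makes Peter's additive-constant remark concrete:

```r
# Sketch: individual log-likelihood contributions for the thread's Poisson
# example, computed with dpois(); their sum reproduces logLik(). Peter's
# deviance-residual trick gives the same values up to the saturated-model term.
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
ll.i <- dpois(counts, fitted(glm.D93), log = TRUE)  # 9 individual values
sum(ll.i)                                # equals as.numeric(logLik(glm.D93))
rd <- residuals(glm.D93, type = "deviance")
ll.sat <- dpois(counts, counts, log = TRUE)          # saturated-model term
all.equal(ll.i, unname(-0.5 * rd^2 + ll.sat))        # TRUE
```

The last line shows exactly what "additive constant" means here: -0.5*rd^2 differs from the true per-observation log-likelihood by the saturated model's contribution, which is fixed for given data.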
Re: [R] How to obtain individual log-likelihood value from glm?
Dear John, On 2020-08-29 1:30 a.m., John Smith wrote: Thanks Prof. Fox. I am curious: what is the model estimated below? Nonsense, as Peter explained in a subsequent response to your prior posting. I guess my inquiry seems more complicated than I thought: with y being 0/1, how to fit weighted logistic regression with weights <1, in the sense of weighted least squares? Thanks What sense would that make? WLS is meant to account for non-constant error variance in a linear model, but in a binomial GLM, the variance is purely a function of the mean. If you had binomial (rather than binary 0/1) observations (i.e., binomial trials exceeding 1), then you could account for overdispersion, e.g., by introducing a dispersion parameter via the quasibinomial family, but that isn't equivalent to variance weights in a LM -- rather, it is analogous to the error-variance parameter in a LM. I guess the question is what are you trying to achieve with the weights? Best, John On Aug 28, 2020, at 10:51 PM, John Fox wrote: Dear John I think that you misunderstand the use of the weights argument to glm() for a binomial GLM. From ?glm: "For a binomial GLM prior weights are used to give the number of trials when the response is the proportion of successes." That is, in this case y should be the observed proportion of successes (i.e., between 0 and 1) and the weights are integers giving the number of trials for each binomial observation. I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2020-08-28 9:28 p.m., John Smith wrote: If the weights < 1, then we have different values! See an example below. How should I interpret logLik value then? 
set.seed(135) y <- c(rep(0, 50), rep(1, 50)) x <- rnorm(100) data <- data.frame(cbind(x, y)) weights <- c(rep(1, 50), rep(2, 50)) fit <- glm(y~x, data, family=binomial(), weights/10) res.dev <- residuals(fit, type="deviance") res2 <- -0.5*res.dev^2 cat("loglikelihood value", logLik(fit), sum(res2), "\n") On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard wrote: If you don't worry too much about an additive constant, then half the negative squared deviance residuals should do. (Not quite sure how weights factor in. Looks like they are accounted for.) -pd On 25 Aug 2020, at 17:33 , John Smith wrote: Dear R-help, The function logLik can be used to obtain the maximum log-likelihood value from a glm object. This is an aggregated value, a summation of individual log-likelihood values. How do I obtain individual values? In the following example, I would expect 9 numbers since the response has length 9. I could write a function to compute the values, but there are lots of family members in glm, and I am trying not to reinvent wheels. Thanks! counts <- c(18,17,15,20,10,20,25,13,12) outcome <- gl(3,1,9) treatment <- gl(3,3) data.frame(treatment, outcome, counts) # showing data glm.D93 <- glm(counts ~ outcome + treatment, family = poisson()) (ll <- logLik(glm.D93)) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd@cbs.dk Priv: pda...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to obtain individual log-likelihood value from glm?
Dear John I think that you misunderstand the use of the weights argument to glm() for a binomial GLM. From ?glm: "For a binomial GLM prior weights are used to give the number of trials when the response is the proportion of successes." That is, in this case y should be the observed proportion of successes (i.e., between 0 and 1) and the weights are integers giving the number of trials for each binomial observation. I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2020-08-28 9:28 p.m., John Smith wrote: If the weights < 1, then we have different values! See an example below. How should I interpret logLik value then? set.seed(135) y <- c(rep(0, 50), rep(1, 50)) x <- rnorm(100) data <- data.frame(cbind(x, y)) weights <- c(rep(1, 50), rep(2, 50)) fit <- glm(y~x, data, family=binomial(), weights/10) res.dev <- residuals(fit, type="deviance") res2 <- -0.5*res.dev^2 cat("loglikelihood value", logLik(fit), sum(res2), "\n") On Tue, Aug 25, 2020 at 11:40 AM peter dalgaard wrote: If you don't worry too much about an additive constant, then half the negative squared deviance residuals should do. (Not quite sure how weights factor in. Looks like they are accounted for.) -pd On 25 Aug 2020, at 17:33 , John Smith wrote: Dear R-help, The function logLik can be used to obtain the maximum log-likelihood value from a glm object. This is an aggregated value, a summation of individual log-likelihood values. How do I obtain individual values? In the following example, I would expect 9 numbers since the response has length 9. I could write a function to compute the values, but there are lots of family members in glm, and I am trying not to reinvent wheels. Thanks! 
counts <- c(18,17,15,20,10,20,25,13,12) outcome <- gl(3,1,9) treatment <- gl(3,3) data.frame(treatment, outcome, counts) # showing data glm.D93 <- glm(counts ~ outcome + treatment, family = poisson()) (ll <- logLik(glm.D93)) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd@cbs.dk Priv: pda...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] install.packages() R vs RStudio
Hi Duncan, What you say is entirely sensible. Yes, it's primarily the silent part that seems problematic to me. Messages about masking are uninteresting until one encounters a problem, and then they may provide an important clue to the source of the problem. As to this specific case: It's not clear to me why it's necessary or even desirable for RStudio to mask utils::install.packages(). After all RStudio provides an alternative route to package installation via the Packages tab, and it wouldn't have been hard to name the function something different from install.packages() to provide additional functionality via direct commands. Best, John On 2020-08-17 3:15 p.m., Duncan Murdoch wrote: Hi John. I suspect most good front ends do similar things. For example, on MacOS, R.app messes up "history()". I've never used ESS, but I imagine one could find examples where it acts differently than base R: isn't that the point? One hopes all differences are improvements, but sometimes they're not. If the modifications cause trouble (e.g. the ones you and I have never experienced with install.packages() in RStudio, or the one I experience every now and then with history() in R.app), then that may be a bug in the front-end. It should be reported to the authors. R is designed to be flexible, and to let people change its behaviour. Using that flexibility is what all users should do. Improving the user experience is what front-end writers should do. I don't find it inadvisable at all. If it's the "silent" part that you object to, I think that's a matter of taste. Personally, I've stopped reading the messages like "Attaching package: ‘zoo’ The following objects are masked from ‘package:base’: as.Date, as.Date.numeric" so they may as well be silent. 
Duncan Murdoch On 17/08/2020 10:02 a.m., John Fox wrote: Dear Duncan, On 2020-08-17 9:03 a.m., Duncan Murdoch wrote: On 17/08/2020 7:54 a.m., Ivan Calandra wrote: Dear useRs, Following the recent activity on the list, I have been made aware of this discussion: https://stat.ethz.ch/pipermail/r-help/2020-May/466788.html I used to install all packages in R, but for simplicity (I use RStudio for all purposes), I now do it in RStudio. Now I am left wondering whether I should continue installing packages directly from RStudio or whether I should revert to using R. My goal is not to flare a debate over whether RStudio is better or worse than R, but rather simply to understand whether there are differences and potential issues (that could lead to problems in code) about installing packages through RStudio. In general, it would be nice to have a list of the differences in behavior between R and RStudio, but I believe this should come from the RStudio side of things. Thank you all for the insights. Ivan To see the install.packages function that RStudio installs, just type its name: > install.packages function (...) .rs.callAs(name, hook, original, ...) You can debug it to see the other variables: > debug(install.packages) > install.packages("abind") debugging in: install.packages("abind") debug: .rs.callAs(name, hook, original, ...) Browse[2]> name [1] "install.packages" Browse[2]> hook function (original, pkgs, lib, repos = getOption("repos"), ...) { if (missing(pkgs)) return(utils::install.packages()) if (!.Call("rs_canInstallPackages", PACKAGE = "(embedding)")) { stop("Package installation is disabled in this version of RStudio", call. 
= FALSE) } packratMode <- !is.na(Sys.getenv("R_PACKRAT_MODE", unset = NA)) if (!is.null(repos) && !packratMode && .rs.loadedPackageUpdates(pkgs)) { installCmd <- NULL for (i in seq_along(sys.calls())) { if (identical(deparse(sys.call(i)[[1]]), "install.packages")) { installCmd <- gsub("\\s+", " ", paste(deparse(sys.call(i)), collapse = " ")) break } } .rs.enqueLoadedPackageUpdates(installCmd) stop("Updating loaded packages") } .rs.addRToolsToPath() on.exit({ .rs.updatePackageEvents() .Call("rs_packageLibraryMutated", PACKAGE = "(embedding)") .rs.restorePreviousPath() }) original(pkgs, lib, repos, ...) } The .rs.callAs function just substitutes the call to "hook" for the call to the original install.packages. So you can see that they do the following: - they allow a way to disable installing packages, - they support "packrat" (a system for installing particular versions of packages, see https://github.com/rstudio/packrat), -
Re: [R] install.packages() R vs RStudio
Dear Duncan, On 2020-08-17 9:03 a.m., Duncan Murdoch wrote: On 17/08/2020 7:54 a.m., Ivan Calandra wrote: Dear useRs, Following the recent activity on the list, I have been made aware of this discussion: https://stat.ethz.ch/pipermail/r-help/2020-May/466788.html I used to install all packages in R, but for simplicity (I use RStudio for all purposes), I now do it in RStudio. Now I am left wondering whether I should continue installing packages directly from RStudio or whether I should revert to using R. My goal is not to flare a debate over whether RStudio is better or worse than R, but rather simply to understand whether there are differences and potential issues (that could lead to problems in code) about installing packages through RStudio. In general, it would be nice to have a list of the differences in behavior between R and RStudio, but I believe this should come from the RStudio side of things. Thank you all for the insights. Ivan To see the install.packages function that RStudio installs, just type its name: > install.packages function (...) .rs.callAs(name, hook, original, ...) You can debug it to see the other variables: > debug(install.packages) > install.packages("abind") debugging in: install.packages("abind") debug: .rs.callAs(name, hook, original, ...) Browse[2]> name [1] "install.packages" Browse[2]> hook function (original, pkgs, lib, repos = getOption("repos"), ...) { if (missing(pkgs)) return(utils::install.packages()) if (!.Call("rs_canInstallPackages", PACKAGE = "(embedding)")) { stop("Package installation is disabled in this version of RStudio", call. 
= FALSE) } packratMode <- !is.na(Sys.getenv("R_PACKRAT_MODE", unset = NA)) if (!is.null(repos) && !packratMode && .rs.loadedPackageUpdates(pkgs)) { installCmd <- NULL for (i in seq_along(sys.calls())) { if (identical(deparse(sys.call(i)[[1]]), "install.packages")) { installCmd <- gsub("\\s+", " ", paste(deparse(sys.call(i)), collapse = " ")) break } } .rs.enqueLoadedPackageUpdates(installCmd) stop("Updating loaded packages") } .rs.addRToolsToPath() on.exit({ .rs.updatePackageEvents() .Call("rs_packageLibraryMutated", PACKAGE = "(embedding)") .rs.restorePreviousPath() }) original(pkgs, lib, repos, ...) } The .rs.callAs function just substitutes the call to "hook" for the call to the original install.packages. So you can see that they do the following: - they allow a way to disable installing packages, - they support "packrat" (a system for installing particular versions of packages, see https://github.com/rstudio/packrat), - they add RTools to the path (presumably only on Windows) - they call the original function, and at the end update internal variables so they can show the library in the Packages pane. So there is no reason not to do it in R. By the way, saying that this is a "modified version of R" is like saying every single user who defines a variable creates a modified version of R. If you type "x" in the plain R console, you see "Error: object 'x' not found". If you "modify" R by assigning a value to x, you'll see something different. Very scary! I can't recall ever disagreeing with something you said on the R-help, but this seems to me to be off-base. While what you say is technically correct, silently masking a standard R function, in this case, I believe, by messing with the namespace of the utils package, seems inadvisable to me. As has been noted, cryptic problems have arisen with install.packages() in RStudio -- BTW, I use it regularly and haven't personally experienced any issues. One could concoct truly scary examples, such as redefining isTRUE(). 
Best, John Duncan Murdoch __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
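The hook mechanism Duncan walks through above suggests a simple check any user can run: compare the visible install.packages to the one in the utils namespace. A sketch (not from the thread; in plain R the two are identical, while a front end that masks the function will differ):

```r
# Sketch: detect whether install.packages() has been masked by a front end
# such as RStudio, and note that the stock version is always reachable.
f <- get("install.packages")
if (identical(f, utils::install.packages)) {
  message("install.packages() is the stock utils version")
} else {
  message("install.packages() is masked; defined in: ",
          environmentName(environment(f)))
}
# Either way, any hook can be bypassed by calling the namespace directly:
# utils::install.packages("abind")
```

Calling utils::install.packages() explicitly sidesteps the masking question entirely, at the cost of RStudio not refreshing its Packages pane afterwards.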
Re: [R] Best settings for RStudio video recording?
Hi, I had occasion last month to teach a two-week, two-hour-per-day lecture series on R via Zoom for the ICPSR Summer Program -- the website for the lectures is at <https://socialsciences.mcmaster.ca/jfox/Courses/R/ICPSR/index.html>. I used RStudio and mostly displayed my desktop via one monitor in a two-monitor setup. That allowed me to show the website (or Canvas site) for the lectures, PDF slides, or the RStudio window, and to have the other monitor free to control the Zoom session. Most of the time, perhaps 1.5 hours per session, I displayed the RStudio window. To set the size of the fonts in RStudio, I tested in a dummy Zoom session that I viewed on a small laptop prior to the start of the lecture series. I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2020-08-14 4:29 a.m., peter dalgaard wrote: [Sorry about the misfire a second ago...] As others have said, for deeper questions, try RStudio's own lists or R-sig-teaching. However, FWIW, I seem to have gotten away with just using a separate virtual desktop with my usual work setup, and then switch to it when necessary. This was for Panopto video recordings, but Zoom et al. should be much the same. Compared to physical lecturing it is actually somewhat easier, because you don't need to worry so much about projector shortcomings, readability from the back row, etc. -pd On 13 Aug 2020, at 20:58 , Jonathan Greenberg wrote: Folks: I was wondering if you all would suggest some helpful RStudio configurations that make recording a session via e.g. zoom the most useful for students doing remote learning. Thoughts? --j -- Jonathan A. 
Greenberg, PhD Randall Endowed Professor and Associate Professor of Remote Sensing Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Natural Resources & Environmental Science University of Nevada, Reno 1664 N Virginia St MS/0186 Reno, NV 89557 Phone: 415-763-5476 https://www.gearslab.org/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Dependent Variable in Logistic Regression
Dear Paul, I think that this thread has gotten unnecessarily complicated. The answer, as is easily demonstrated, is that a binary response for a binomial GLM in glm() may be a factor, a numeric variable, or a logical variable, with identical results; for example: --- snip --- > set.seed(123) > head(x <- rnorm(100)) [1] -0.56047565 -0.23017749 1.55870831 0.07050839 0.12928774 1.71506499 > head(y <- rbinom(100, 1, 1/(1 + exp(-x)))) [1] 0 1 1 1 1 0 > head(yf <- as.factor(y)) [1] 0 1 1 1 1 0 Levels: 0 1 > head(yl <- y == 1) [1] FALSE TRUE TRUE TRUE TRUE FALSE > glm(y ~ x, family=binomial) Call: glm(formula = y ~ x, family = binomial) Coefficients: (Intercept) x 0.3995 1.1670 Degrees of Freedom: 99 Total (i.e. Null); 98 Residual Null Deviance: 134.6 Residual Deviance: 114.9 AIC: 118.9 > glm(yf ~ x, family=binomial) Call: glm(formula = yf ~ x, family = binomial) Coefficients: (Intercept) x 0.3995 1.1670 Degrees of Freedom: 99 Total (i.e. Null); 98 Residual Null Deviance: 134.6 Residual Deviance: 114.9 AIC: 118.9 > glm(yl ~ x, family=binomial) Call: glm(formula = yl ~ x, family = binomial) Coefficients: (Intercept) x 0.3995 1.1670 Degrees of Freedom: 99 Total (i.e. Null); 98 Residual Null Deviance: 134.6 Residual Deviance: 114.9 AIC: 118.9 --- snip --- The original poster claimed to have encountered an error with a 0/1 numeric response, but didn't show any data or even a command. I suspect that the response was a character variable, but of course can't really know that. Best, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2020-08-01 2:25 p.m., Paul Bernal wrote: Dear friend, I am aware that I have a binomial dependent variable, which is covid status (1 if covid positive, and 0 otherwise). My question was if R requires to turn a binomial response variable into a factor or not, that's all. Cheers, Paul On Sat., Aug. 1, 2020, 1:22 p.m., Bert Gunter wrote: ... 
yes, but so does lm() for a categorical **INdependent** variable with more than 2 numerically labeled levels. n levels = (n-1) df for a categorical covariate, but 1 for a continuous one (unless more complex models are explicitly specified of course). As I said, the OP seems confused about whether he is referring to the response or covariates. Or maybe he just made the same typo I did. Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Sat, Aug 1, 2020 at 11:15 AM Patrick (Malone Quantitative) < mal...@malonequantitative.com> wrote: No, R does not. glm() does in order to do logistic regression. On Sat, Aug 1, 2020 at 2:11 PM Paul Bernal wrote: Hi Bert, Thank you for the kind reply. But what if I don't turn the variable into a factor. Let's say that in Excel I just coded the variable as 1s and 0s and just imported the dataset into R and fitted the logistic regression without turning any categorical variable or dummy variable into a factor? Does R require every dummy variable to be treated as a factor? Best regards, Paul On Sat., Aug. 1, 2020, 12:59 p.m., Bert Gunter < bgunter.4...@gmail.com> wrote: x <- factor(0:1) x <- factor("yes","no") will produce identical results up to labeling. Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Sat, Aug 1, 2020 at 10:40 AM Paul Bernal wrote: Dear friends, Hope you are doing great. I want to fit a logistic regression in R, where the dependent variable is the covid status (I used 1 for covid positives, and 0 for covid negatives), but when I ran the glm, R complains that I should make the dependent variable a factor. What would be more advisable, to keep the dependent variable with 1s and 0s, or code it as yes/no and then make it a factor? 
Any guidance will be greatly appreciated, Best regards, Paul [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide ht
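John's suspicion that the original poster had a character response is easy to check with a sketch (made-up data; the poster's actual data and command were never shown):

```r
# Sketch (made-up data): a character response fails in glm(binomial),
# while the factor version of the same variable works -- consistent with
# John's guess about the source of the original poster's error.
x <- 1:10
y.ch <- rep(c("no", "yes"), 5)                     # character vector
bad <- try(glm(y.ch ~ x, family = binomial), silent = TRUE)
inherits(bad, "try-error")                         # TRUE: character fails
ok <- glm(factor(y.ch) ~ x, family = binomial)     # factor: fits fine
```

So either coding works, as long as the variable actually reaches glm() as numeric 0/1, logical, or a factor rather than raw character strings.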
Re: [R] Axis with inverse logarithmic scale
Dear Martin, On 7/28/2020 10:17 AM, Martin Maechler wrote: Martin Maechler on Tue, 28 Jul 2020 15:56:10 +0200 writes: John Fox on Mon, 27 Jul 2020 12:57:57 -0400 writes: >> Dear Dileepkumar R, >> As is obvious from the tick marks, the vertical axis is not log-scaled: >>> log10(99.999) - log10(99.99) >> [1] 3.908865e-05 >>> log10(99) - log10(90) >> [1] 0.04139269 >> That is, these (approximately?) equally spaced ticks aren't equally >> spaced on the log scale. >> The axis is instead apparently (at least approximately) on the logit >> (log-odds) scale: >>> library(car) >> Loading required package: carData >>> logit(99.999) - logit(99.99) >> [1] 2.302675 >>> logit(99) - logit(90) >> [1] 2.397895 > Small remark : You don't need car (or any other extra pkg) to have logit: > logit <- plogis # is sufficient > Note that the ?plogis (i.e. 'Logistic') help page has had a > \concept{logit} > entry (which would help if one used help.search() .. {I don't; > I have 1 of packages}), > and that same help page has been talking about 'logit' for ca 16 > years now (and I'm sure this is news for most readers, still)... but now I see that car uses the "empirical logit" function, where plogis() provides the mathematical logit(): Not quite the empirical logit, because we don't know the counts, but a similar idea when the proportions include 0 or 1. Also, logit() recognizes percents as well as proportions, and so there's no need to convert the former to the latter. The former is typically needed for data transformations where you don't want to map {0,1} to -/+ Inf but rather to finite values .. So I should stayed quiet, probably.. Well, I wouldn't go so far as that. Best, John Martin >> You can get a graph close to the one you shared via the following: >> library(car) # repeated so you don't omit it > .. and here you need 'car' for the nice probabilityAxis(.) .. 
>>> logits <- logit(y_values) >>> plot(x_value, logits, log="x", axes=FALSE, >> + xlim=c(1, 200), ylim=logit(c(10, 99.999)), >> + xlab="Precipitation Intensity (mm/d)", >> + ylab="Cumulative Probability", >> + main="Daily U.S. Precipitation", >> + col="magenta") >>> axis(1, at=c(1, 2, 5, 10, 20, 50, 100, 200)) >>> probabilityAxis(side=2, at=c(10, 30, 50, 90, 99, 99.9, 99.99, >> 99.999)/100) >>> box() >> Comments: >> This produces probabilities, not percents, on the vertical axis, which >> conforms to what the axis label says. Also, the ticks in the R version >> point out rather than into the plotting region -- the former is >> generally considered better practice. Finally, the graph is not a >> histogram as the original title states. >> I hope this helps, >> John >> >> John Fox >> Professor Emeritus >> McMaster University >> Hamilton, Ontario, Canada >> web: https://socialsciences.mcmaster.ca/jfox/ >> On 7/27/2020 11:56 AM, Dileepkumar R wrote: >>> I think the attached sample figure is not visible >>> Here is the sample figure: >>> https://drive.google.com/file/d/16Uy3JD0wsEucUv_KOhXCxLZ4U-3wiBTs/view?usp=sharing >>> >>> sincerely, >>> >>> >>> Dileepkumar R >>> >>> >>> >>> >>> On Mon, Jul 27, 2020 at 7:13 PM Dileepkumar R >>> wrote: >>> >>>> Dear All, >>>> >>>> I want to plot a simple cumulative probability distribution graph with >>>> like the attached screenshot. >>>> But I couldn't fix the y-axis scale as in that screenshot. >>>> >>>> My data details are follows: >>>> >>>> y_values >>>> =c(66.78149,76.10846,81.65518,85.06448,87.61703,89.61314,91.20297,92.36884, >>>> 93.64070,94.57693,95.23052,95.75163,96.15792,96.58188,96.97933,97.29730, >>>> 97.59760,97.91556,98.14520,98.37485,98.57799,98.74580,98.87829,99.06377, >>>> 99.16093,99.25808,99.37290,99.45239,99.54072,99.59371,99.62904,99.6643
Re: [R] Axis with inverse logarithmic scale
Dear Dileepkumar R, As is obvious from the tick marks, the vertical axis is not log-scaled: > log10(99.999) - log10(99.99) [1] 3.908865e-05 > log10(99) - log10(90) [1] 0.04139269 That is, these (approximately?) equally spaced ticks aren't equally spaced on the log scale. The axis is instead apparently (at least approximately) on the logit (log-odds) scale: > library(car) Loading required package: carData > logit(99.999) - logit(99.99) [1] 2.302675 > logit(99) - logit(90) [1] 2.397895 You can get a graph close to the one you shared via the following: library(car) # repeated so you don't omit it > logits <- logit(y_values) > plot(x_value, logits, log="x", axes=FALSE, + xlim=c(1, 200), ylim=logit(c(10, 99.999)), + xlab="Precipitation Intensity (mm/d)", + ylab="Cumulative Probability", + main="Daily U.S. Precipitation", + col="magenta") > axis(1, at=c(1, 2, 5, 10, 20, 50, 100, 200)) > probabilityAxis(side=2, at=c(10, 30, 50, 90, 99, 99.9, 99.99, 99.999)/100) > box() Comments: This produces probabilities, not percents, on the vertical axis, which conforms to what the axis label says. Also, the ticks in the R version point out rather than into the plotting region -- the former is generally considered better practice. Finally, the graph is not a histogram as the original title states. I hope this helps, John John Fox Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 7/27/2020 11:56 AM, Dileepkumar R wrote: I think the attached sample figure is not visible Here is the sample figure: https://drive.google.com/file/d/16Uy3JD0wsEucUv_KOhXCxLZ4U-3wiBTs/view?usp=sharing sincerely, Dileepkumar R On Mon, Jul 27, 2020 at 7:13 PM Dileepkumar R wrote: Dear All, I want to plot a simple cumulative probability distribution graph with like the attached screenshot. But I couldn't fix the y-axis scale as in that screenshot. 
My data details are follows: y_values =c(66.78149,76.10846,81.65518,85.06448,87.61703,89.61314,91.20297,92.36884, 93.64070,94.57693,95.23052,95.75163,96.15792,96.58188,96.97933,97.29730, 97.59760,97.91556,98.14520,98.37485,98.57799,98.74580,98.87829,99.06377, 99.16093,99.25808,99.37290,99.45239,99.54072,99.59371,99.62904,99.66437, 99.69970,99.70853,99.72620,99.73503,99.77036,99.79686,99.80569,99.82335, 99.83219,99.84985,99.86751,99.87635,99.87635,99.90284,99.90284,99.90284, 99.91168,99.92051,99.92051,99.93817,99.93817,99.93817,99.95584,99.95584, 99.97350,99.97350,99.97350,99.97350,99.97350,99.97350,99.97350) x_value=seq(63) Thank you all in advance Dileepkumar R [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
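The tick-spacing check that opens this thread can be reproduced with base R alone, in the spirit of Martin's remark: qlogis() is base R's logit (plogis() being its inverse), applied to proportions rather than percents.

```r
# Sketch: the thread's tick-spacing check without the car package.
# qlogis() is the base-R logit; percents are converted to proportions first.
qlogis(0.99999) - qlogis(0.9999)  # ~2.3027, matches logit(99.999) - logit(99.99)
qlogis(0.99)    - qlogis(0.90)    # ~2.3979, matches logit(99) - logit(90)
# whereas on the log10 scale the same ticks are far from equally spaced:
log10(99.999) - log10(99.99)      # ~3.9e-05
log10(99)     - log10(90)         # ~0.041
```

The near-equal qlogis() differences confirm John's diagnosis: the axis in the original figure is (approximately) logit-scaled, not log-scaled.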
Re: [R] Error message from allEffects(model) / effect(model): "'range' not meaningful for factors"
Hi,

You fit the model with lm() but then passed summary(m), rather than the "lm" object itself, to allEffects() and effect(). Pass the model object directly:

> m <- lm(y ~ x)  # no problem
> allEffects(m)   # also no problem
 model: y ~ x

 x effect
x
       a        b        c
3.322448 3.830997 4.969154

> effect("x", m)  # ditto

 x effect
x
       a        b        c
3.322448 3.830997 4.969154

> Effect("x", m)  # ditto

 x effect
x
       a        b        c
3.322448 3.830997 4.969154

Best,
 John

-------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.socsci.mcmaster.ca/jfox/

> -----Original Message-----
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Robert Zimbardo
> Sent: Tuesday, August 18, 2015 8:50 PM
> To: r-help@r-project.org
> Subject: [R] Error message from allEffects(model) / effect(model)
> "'range' not meaningful for factors"
>
> Hi
>
> I cannot figure out why the effects package throws me error messages
> with the following simple code:
>
> rm(list=ls(all=TRUE)); set.seed(1); library(effects)
> # set up data
> x <- factor(rep(letters[1:3], each=100))
> y <- c(rnorm(100, 3, 3), rnorm(100, 4, 3), rnorm(100, 5, 3))
>
> # fit linear model
> m <- summary(lm(y~x)) # no problem
>
> # now the problem
> plot(allEffects(m))
> # Error in Summary.factor(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, :
> #   'range' not meaningful for factors
> plot(effect("x", m))
> # Error in Summary.factor(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, :
> #   'range' not meaningful for factors
>
> Any ideas? It's got to be something superobvious, but I don't get it.
> Thanks,
> RZ

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
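The fix in one self-contained sketch, using the data from the original message; the only substantive difference is which object is passed to allEffects():

```r
library(effects)  # assumes the effects package is installed
set.seed(1)
x <- factor(rep(letters[1:3], each = 100))
y <- c(rnorm(100, 3, 3), rnorm(100, 4, 3), rnorm(100, 5, 3))

m <- lm(y ~ x)       # keep the "lm" object itself
plot(allEffects(m))  # works

# m2 <- summary(lm(y ~ x))  # a "summary.lm" object, not a model --
# plot(allEffects(m2))      # this is what produced the error
```

The effects functions need the fitted-model object (its formula, data, coefficients, and covariance matrix); summary() returns a printable digest that lacks most of that.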
Re: [R] R GUI tklistbox get value
Dear j.para.fernandez,

Try

    selecvar <- dat[, as.numeric(tkcurselection(tl)) + 1]

Omitting the comma returns a one-column data frame, not a numeric vector.

I hope this helps,
 John

----
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

On Mon, 20 Jul 2015 03:29:07 -0700 (PDT) jpara3 wrote:
> Hi, I have a dataframe, dat, with 2 variables, one and two.
>
> I want to print in R the mean of the selected variable of the dataframe. You
> can select it with a tklistbox, but when you click the OK button, the mean is
> not displayed, just NA.
>
> one <- c(5,5,6,9,5,8)
> two <- c(12,13,14,12,14,12)
> dat <- data.frame(one, two)
>
> require(tcltk)
> tt <- tktoplevel()
> tl <- tklistbox(tt, height=4, selectmode="single")
> tkgrid(tklabel(tt, text="Select the variable whose mean to compute"))
> tkgrid(tl)
> for (i in 1:2) {
>   tkinsert(tl, "end", colnames(dat)[i])
> }
>
> OnOK <- function() {
>   selecvar <- dat[as.numeric(tkcurselection(tl)) + 1]
>   print(mean(selecvar))
> }
> OK.but <- tkbutton(tt, text=" OK ", command=OnOK)
> tkgrid(OK.but)
> tkfocus(tt)
>
> Can someone please help me?? Thanks!!!
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/R-GUI-tklistbox-get-value-tp4710064.html
> Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
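The distinction John points to -- single-bracket indexing of a data frame without a comma returns a data frame, while `[, j]` drops to a vector -- can be seen directly. A sketch using the data from the message:

```r
dat <- data.frame(one = c(5, 5, 6, 9, 5, 8),
                  two = c(12, 13, 14, 12, 14, 12))

class(dat[1])    # "data.frame" -- a one-column data frame
class(dat[, 1])  # "numeric"    -- the column as a plain vector

mean(dat[, 1])   # 6.333333
# mean(dat[1])   # NA with a warning in recent R: argument is not numeric
```

That NA-with-warning behaviour of mean() on a data frame is exactly what the original poster saw.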
Re: [R] powerTransform warning message?
Dear Brittany,

On Thu, 16 Jul 2015 17:35:38 -0600 Brittany Demmitt wrote:
> Hello,
>
> I have a series of 40 variables that I am trying to transform via the Box-Cox
> method using the powerTransform function in R. I have no zero values in any
> of my variables. When I run the powerTransform function on the full data set
> I get the following warning:
>
> Warning message:
> In sqrt(diag(solve(res$hessian))) : NaNs produced
>
> However, when I analyze the variables in groups, rather than all 40 at a time,
> I do not get this warning message. Why would this be? And does this mean
> this warning is safe to ignore?

No, it is not safe to ignore the warning, and the problem has nothing to do with non-positive values in the data -- when you say that there are no 0s in the data, I assume that you mean that the data values are all positive.

The square roots of the diagonal entries of the inverse Hessian at the (pseudo-) ML estimates are the SEs of the estimated transformation parameters. If the Hessian can't be inverted, that usually implies that the maximum of the (pseudo-) likelihood isn't well defined. This isn't surprising when you're trying to transform as many as 40 variables at a time to multivariate normality.

It's my general experience that people often throw their data into the Box-Cox black box and hope for the best without first examining the data and, e.g., ensuring a reasonable ratio of maximum to minimum values for each variable, checking for extreme outliers, etc. Of course, I don't know that you did that, and it's perfectly possible that you were careful.

> I would like to add that all of my lambda values are in the -5 to 5 range. I
> also get different lambda values when I analyze the variables together versus
> in groups. Is this to be expected?

Yes. It's very unlikely that both are right. If, e.g., the variables are multivariate normal within groups, then their marginal distribution is a mixture of multivariate normals, which almost surely isn't itself normal.
I hope this helps,
 John

John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

> Thank you so much!
>
> Brittany

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
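A quick pre-check of the sort John describes might look like the following sketch; `d` is a hypothetical stand-in for your data frame of 40 all-positive variables:

```r
# Ratio of max to min for each (all-positive) column: variables whose
# ratios are close to 1 carry little information about the Box-Cox
# lambda and can make the (pseudo-) likelihood surface nearly flat.
ratios <- sapply(d, function(x) max(x) / min(x))
sort(ratios)

# Transforming a handful of variables at a time is also more likely
# to give a well-defined maximum than all 40 at once:
library(car)
summary(powerTransform(d[, 1:4]))
```

If the inverse-Hessian warning disappears for small, well-behaved subsets but reappears for the full set, that points to the joint optimization, not to any single variable.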
Re: [R] Plot in Rcmdr
Dear David and Joanne,

David, thank you for answering Joanne's question before I saw it. The help page for car::scatterplot() is also accessible via the Help button in the Rcmdr scatterplot dialog.

I'll think about whether to add a control for legend position to the scatterplot dialog. There are already some enhancements to the dialog in the forthcoming version 2.2-0 of the Rcmdr package, due late this summer, but I try not to make the dialogs too complicated.

Best,
 John

-------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.socsci.mcmaster.ca/jfox/

> -----Original Message-----
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of David L Carlson
> Sent: July-14-15 1:17 PM
> To: INGRAM Joanne; r-help@R-project.org
> Subject: Re: [R] Plot in Rcmdr
>
> It can be changed by slightly modifying the scatterplot() command in the
> R Script window and re-submitting it.
>
> From the top menu select Data | Data in packages | Read data set from
> an attached package. Then type Pottery in the space next to "Enter name
> of data set" (notice that Pottery is capitalized).
>
> From the top menu select Graphs | Scatterplot and then select Al as the
> x-variable and Ca as the y-variable. Click on "Plot by groups..." and
> select Site (and unselect "Plot lines by group"). Click OK and OK again to
> produce the plot. The legend is outside the plot region and the top
> margin has been expanded to make room for it.
>
> In the R Script window you will see the command:
>
> scatterplot(Ca~Al | Site, reg.line=lm, smooth=TRUE, spread=TRUE,
>   id.method='mahal', id.n = 2, boxplots='xy', span=0.5, by.groups=FALSE,
>   data=Pottery)
>
> Add a single argument to the end of the command so that it looks like this:
>
> scatterplot(Ca~Al | Site, reg.line=lm, smooth=TRUE, spread=TRUE,
>   id.method='mahal', id.n = 2, boxplots='xy', span=0.5, by.groups=FALSE,
>   data=Pottery, legend.coords="topright")
>
> Then select all three lines and click Submit.
>
> The new plot puts the legend in the upper right corner of the plot
> region. R Commander uses the scatterplot() function from the car package
> to create the plot. It has several options that are not included in the
> options dialog window in R Commander, but they can be accessed simply by
> editing the command that R Commander creates.
>
> To see these options, type
>
> ?scatterplot
>
> on an empty line in the R Script window, put the cursor on the line, and
> click Submit. This will open your web browser with the manual page for
> scatterplot.
>
> -
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
> -----Original Message-----
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of INGRAM Joanne
> Sent: Tuesday, July 14, 2015 9:53 AM
> To: r-help@R-project.org
> Subject: [R] Plot in Rcmdr
>
> Hello,
>
> I wondered if anyone could help me with a small issue in Rcmdr.
>
> I have used the 'Graphs' function in the drop-down menu to create a
> scatterplot for groups (gender). But when I do this the legend (telling
> me the symbols which represent male etc.) keeps obscuring the title of
> the plot. Does anyone know how to fix this problem -- within Rcmdr?
>
> Please note I am not looking for help with creating the graph in another
> way (for example in R). I am specifically trying to figure out if this
> can be fixed in Rcmdr.
> If the answer is "No -- this cannot currently be changed within Rcmdr",
> I would still like to hear from you.
>
> Many thanks for any help.
>
> Joanne Ingram
> Research Associate (Medical Statistics)
> Centre for Population Health Science
> University of Edinburgh
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] detecting any element in a vector of strings, appearing anywhere in any of several character variables in a dataframe
Dear Christopher,

My usual orientation to this kind of one-off problem is that I'm looking for a simple correct solution. Computing time is usually much smaller than programming time. That said, Bert Gunter's solution was about 5 times faster in a simple check that I ran with microbenchmark, and Jeff Newmiller's solution was about 10 times faster.

Both Bert's and Jeff's (eventual) solution protect against partial (rather than full-word) matches, while mine doesn't (though it could easily be modified to do that).

Best,
 John

> -----Original Message-----
> From: Christopher W Ryan [mailto:cr...@binghamton.edu]
> Sent: July-09-15 2:49 PM
> To: Bert Gunter
> Cc: Jeff Newmiller; R Help; John Fox
> Subject: Re: [R] detecting any element in a vector of strings, appearing
> anywhere in any of several character variables in a dataframe
>
> Thanks everyone. John's original solution worked great. And with
> 27,000 records, 65 alarm.words, and 6 columns to search, it takes only
> about 15 seconds. That is certainly adequate for my needs. But I
> will try out the other strategies too.
>
> And thanks also for lots of new R things to learn -- grep, grepl,
> do.call . . . that's always a bonus!
>
> --Chris Ryan
>
> On Thu, Jul 9, 2015 at 1:52 PM, Bert Gunter wrote:
> > Yup, that does it. Let grep figure out what's a word rather than doing
> > it manually. Forgot about "\b".
> >
> > Cheers,
> > Bert
> >
> > On Thu, Jul 9, 2015 at 10:30 AM, Jeff Newmiller wrote:
> >> Just add a word break marker before and after:
> >>
> >> zz$v5 <- grepl( paste0( "\\b(", paste0( alarm.words, collapse="|" ),
> >>   ")\\b" ), do.call( paste, zz[ , 2:3 ] ) )
> >>
> >> Jeff Newmiller
> >> Sent from my phone. Please excuse my brevity.
> >>
> >> On July 9, 2015 10:12:23 AM PDT, Bert Gunter wrote:
> >>> Jeff:
> >>>
> >>> Well, it would be much better (no loops!) except, I think, for one
> >>> issue: "red" would match "barred", and I don't think that this is what
> >>> is wanted: the matches should be on whole "words", not just string
> >>> patterns.
> >>>
> >>> So you would need to fix up the matching pattern to make this work,
> >>> but it may be a little tricky, as arbitrary whitespace characters,
> >>> e.g. " " or "\n" etc., could be in the strings to be matched, separating
> >>> the words or ending the "sentence." I'm sure it can be done, but I'll
> >>> leave it to you or others to figure it out.
> >>>
> >>> Of course, if my diagnosis is wrong or silly, please point this out.
> >>>
> >>> Cheers,
> >>> Bert
> >>>
> >>> On Thu, Jul 9, 2015 at 9:34 AM, Jeff Newmiller wrote:
> >>>> I think grep is better suited to this:
> >>>>
> >>>> zz$v5 <- grepl( paste0( alarm.words, collapse="|" ),
> >>>>   do.call( paste, zz[ , 2:3 ] ) )
> >>>>
> >>>> Jeff Newmiller
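Putting the thread's final answer together as one runnable piece, using the data from Chris's original message; the `\\b` markers are Bert's whole-word fix to Jeff's grepl() approach:

```r
v2 <- c("white bird", "blue bird", "green turtle", "quick brown fox",
        "big black dog", "waffle the hamster", "benny likes food a lot",
        "hello world", "yellow giraffe with a long neck", "black bear")
v3 <- c("harry potter", "hermione grainger", "ronald weasley",
        "ginny weasley", "dudley dursley", "red sparks", "blue sparks",
        "white dress robes", "gandalf the white", "gandalf the grey")
zz <- data.frame(v1 = rnorm(10), v2 = v2, v3 = v3, v4 = rpois(10, lambda = 2),
                 stringsAsFactors = FALSE)
alarm.words <- c("red", "green", "turtle", "gandalf")

# \\b(...)\\b matches any alarm word as a whole word -- so "red" no
# longer matches "barred" -- anywhere in the pasted-together columns:
pat <- paste0("\\b(", paste(alarm.words, collapse = "|"), ")\\b")
zz$v5 <- grepl(pat, do.call(paste, zz[, 2:3]))
which(zz$v5)  # rows 3, 6, 9, 10 -- same rows as John's apply() solution
```

do.call(paste, zz[, 2:3]) concatenates the two text columns row-wise, so one grepl() call scans every searchable field at once.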
Re: [R] detecting any element in a vector of strings, appearing anywhere in any of several character variables in a dataframe
Dear Chris,

If I understand correctly what you want, how about the following?

> rows <- apply(zz[, 2:3], 1, function(x) any(sapply(alarm.words, grepl, x=x)))
> zz[rows, ]
          v1                              v2                v3 v4
3  -1.022329                    green turtle    ronald weasley  2
6   0.336599              waffle the hamster        red sparks  1
9  -1.631874 yellow giraffe with a long neck gandalf the white  1
10  1.130622                      black bear  gandalf the grey  2

I hope this helps,
 John

----
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

On Wed, 08 Jul 2015 22:23:37 -0400 "Christopher W. Ryan" wrote:
> Running R 3.1.1 on Windows 7
>
> I want to identify as a case any record in a dataframe that contains any
> of several keywords in any of several variables.
>
> Example:
>
> # create a dataframe with 4 variables and 10 records
> v2 <- c("white bird", "blue bird", "green turtle", "quick brown fox",
>   "big black dog", "waffle the hamster", "benny likes food a lot",
>   "hello world", "yellow giraffe with a long neck", "black bear")
> v3 <- c("harry potter", "hermione grainger", "ronald weasley",
>   "ginny weasley", "dudley dursley", "red sparks", "blue sparks",
>   "white dress robes", "gandalf the white", "gandalf the grey")
> zz <- data.frame(v1=rnorm(10), v2=v2, v3=v3, v4=rpois(10, lambda=2),
>   stringsAsFactors=FALSE)
> str(zz)
> zz
>
> # here are the keywords
> alarm.words <- c("red", "green", "turtle", "gandalf")
>
> # For each row/record, I want to test whether the string in v2 or the
> # string in v3 contains any of the strings in alarm.words. And then if so,
> # set zz$v5=TRUE for that record.
>
> # I'm thinking the str_detect function in the stringr package ought to
> # be able to help, perhaps with some use of apply over the rows, but I
> # obviously misunderstand something about how str_detect works
>
> library(stringr)
>
> str_detect(zz[,2:3], alarm.words)    # error: the target of the search
>                                      # must be a vector, not multiple
>                                      # columns
>
> str_detect(zz[1:4,2:3], alarm.words) # same error
>
> str_detect(zz[,2], alarm.words)      # error, length of alarm.words
>                                      # is less than the number of
>                                      # rows I am using for the
>                                      # comparison
>
> str_detect(zz[1:4,2], alarm.words)   # works as hoped when confining
>                                      # nrows to the length of alarm.words
>
> str_detect(zz, alarm.words)          # obviously not right
>
> # maybe I need apply() ?
> my.f <- function(x){str_detect(x, alarm.words)}
>
> apply(zz[,2], 1, my.f)   # again, a mismatch in lengths
>                          # between alarm.words and that
>                          # in which I am searching for
>                          # matching strings
>
> apply(zz, 2, my.f)       # now I'm getting somewhere
> apply(zz[1:4,], 2, my.f) # but still only works with 4
>                          # rows of the dataframe
>
> # perhaps %in% could do the job?
>
> Appreciate any advice.
>
> --Chris Ryan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] tcltk2 entry box
Dear Matthew,

For file selection, see ?tcltk::tk_choose.files or ?tcltk::tkgetOpenFile.

You could enter a number in a tk entry widget, but, depending upon the nature of the number, a slider or other widget might be a better choice.

For a variety of helpful tcltk examples see <http://www.sciviews.org/_rgui/tcltk/>, originally by James Wettenhall but now maintained by Philippe Grosjean (the author of the tcltk2 package). (You probably don't need tcltk2 for the simple operations that you mention, but see ?tk2spinbox for an alternative to a slider.)

Best,
 John

-------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.socsci.mcmaster.ca/jfox/

> -----Original Message-----
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Matthew
> Sent: July-08-15 8:01 PM
> To: r-help
> Subject: [R] tcltk2 entry box
>
> Is anyone familiar enough with the tcltk2 package to know if it is
> possible to have an entry box where a user can enter information (such
> as a path to a file or a number) and then be able to use the entered
> information downstream in an R script?
>
> The idea is for someone unfamiliar with R to just start an R script that
> would take care of all the commands for them, so all they have to do is
> get the script started. However, there are always a couple of pieces of
> information that will change each time the script is used (for example,
> a different file will be processed by the script). So, I would like a
> way for the user to input that information as the script runs.
>
> Matthew McCormack

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
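A minimal sketch of the pattern Matthew describes, using base tcltk as John suggests; the variable names (`path_var`, `infile`) are illustrative, not from any package:

```r
library(tcltk)

# Simplest route for a file path: a native file-selection dialog.
# infile <- tk_choose.files(caption = "Select the file to process")

# Or an entry box whose value is used downstream in the script:
tt <- tktoplevel()
path_var <- tclVar("")  # tclVar()/tclvalue() share state with the widget
tkgrid(tklabel(tt, text = "File to process:"))
tkgrid(tkentry(tt, textvariable = path_var, width = 50))
onOK <- function() {
  infile <<- tclvalue(path_var)  # copy the value out before closing
  tkdestroy(tt)
}
tkgrid(tkbutton(tt, text = " OK ", command = onOK))
tkwait.window(tt)  # block the script until the dialog is dismissed

# dat <- read.csv(infile)  # downstream use of the entered path
```

tkwait.window() is what lets a plain script pause for the user's input and then continue with the rest of its commands, which is exactly the "just start the script" workflow described above.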