[R] DPLYR Multiple Mutate Statements On Same DataFrame
Hi R Helpers, I have been looking for an example of how to execute different dplyr mutate statements on the same dataframe in a single step. I show how to do what I want to do by going from df0 to df1 to df2 to df3 by applying a mutate statement to each dataframe in sequence, but I would like to know if there is a way to execute this in a single step; so simply go from df0 to df1 while executing all the transformations. See example below. Guidance would be appreciated. --John J. Sparks, Ph.D. library(dplyr) df0<-structure(list(SeqNum = c(1L, 2L, 3L, 4L, 5L, 6L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 18L, 19L, 21L, 22L, 23L), MOSTYP = c(37L, 41L, 41L, 13L, 3L, 27L, 37L, 37L, 15L, 14L, 13L, 37L, 4L, 27L, 37L, 26L, 17L, 37L, 37L, 17L), MGEMOM = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L), MGODRK = c(3L, 2L, 2L, 3L, 4L, 2L, 2L, 2L, 3L, 4L, 3L, 2L, 3L, 1L, 2L, 3L, 4L, 4L, 3L, 3L), MOSHOO = c(7L, 7L, 7L, 2L, 9L, 4L, 7L, 7L, 2L, 2L, 2L, 7L, 9L, 4L, 7L, 4L, 2L, 7L, 7L, 2L), MRELGE = c(0L, 1L, 0L, 2L, 1L, 0L, 0L, 0L, 3L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 2L, 0L, 0L, 1L), MSKB2 = c(5L, 4L, 4L, 3L, 4L, 5L, 7L, 1L, 5L, 4L, 3L, 4L, 5L, 6L, 7L, 5L, 4L, 6L, 4L, 7L), MFWEKI = c(1L, 1L, 2L, 2L, 1L, 0L, 0L, 3L, 0L, 1L, 2L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 2L, 0L), MAANTH = c(3L, 4L, 4L, 4L, 4L, 5L, 2L, 6L, 2L, 4L, 4L, 4L, 4L, 2L, 2L, 4L, 3L, 3L, 3L, 2L), MHHUUR = c(2L, 2L, 4L, 2L, 2L, 3L, 0L, 3L, 2L, 2L, 2L, 3L, 1L, 6L, 0L, 2L, 2L, 0L, 2L, 2L), MSKA = c(1L, 0L, 4L, 2L, 2L, 3L, 0L, 3L, 2L, 0L, 2L, 3L, 1L, 5L, 0L, 0L, 1L, 0L, 0L, 1L), MAUT2 = c(2L, 4L, 4L, 3L, 4L, 5L, 5L, 3L, 2L, 3L, 3L, 4L, 4L, 3L, 5L, 2L, 3L, 3L, 2L, 3L), MFALLE = c(1L, 0L, 0L, 3L, 5L, 0L, 0L, 0L, 0L, 4L, 1L, 1L, 2L, 2L, 0L, 2L, 5L, 0L, 0L, 3L), MGEMLE = c(1L, 0L, 0L, 0L, 4L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 3L, 2L, 0L), MAUT1 = c(2L, 5L, 7L, 3L, 0L, 4L, 2L, 1L, 3L, 9L, 5L, 3L, 2L, 4L, 2L, 1L, 3L, 0L, 4L, 2L), MINKGE = c(2L, 4L, 2L, 2L, 0L, 2L, 2L, 1L, 3L, 0L, 1L, 4L, 2L, 2L, 2L, 5L, 1L, 0L, 3L, 1L), MOPLHO = c(1L, 0L, 0L, 0L, 0L, 2L, 2L, 1L, 2L, 0L, 0L, 1L, 0L, 0L, 2L, 0L, 0L, 0L, 0L, 0L), MGODPR = c(1L, 2L, 2L, 0L, 1L, 3L, 2L, 3L, 2L, 1L, 2L, 3L, 0L, 3L, 2L, 2L, 2L, 0L, 2L, 1L), MAUT0 = c(8L, 6L, 9L, 7L, 5L, 9L, 6L, 7L, 6L, 5L, 4L, 7L, 8L, 5L, 6L, 7L, 5L, 9L, 9L, 5L), MSKB1 = c(0L, 2L, 4L, 1L, 0L, 5L, 2L, 7L, 2L, 0L, 3L, 3L, 3L, 4L, 2L, 0L, 2L, 3L, 3L, 1L), MSKC = c(4L, 5L, 3L, 4L, 6L, 3L, 3L, 2L, 4L, 8L, 3L, 3L, 4L, 3L, 3L, 4L, 4L, 3L, 3L, 5L), PAANHA = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), PWAPAR = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), PPERSA = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), AMOTSC = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), APERSA = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), AWAPAR = c(1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L), Resp = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), row.names = c(NA, 20L), class = "data.frame") df1<-df0 %>% mutate(across(starts_with('P'),~ifelse(.x==0, 0, ifelse(.x==1, 25, ifelse(.x==2, 75, ifelse(.x==3, 150, ifelse(.x==4, 350, ifelse(.x==5, 750, ifelse(.x==6, 3000, ifelse(.x==7, 7500, ifelse(.x==8,15000, ifelse(.x==9,3, -99 df2<-df1 %>% mutate_at(vars(MRELGE:MSKC),~ifelse(.x==0, 0, ifelse(.x==1, 5, -99))) df3<-df2 %>% mutate_at(vars(MGODRK),~ifelse(.x==0, 0, 
ifelse(.x==1, 5, -99)))
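One way to collapse the three steps into a single mutate() call (an illustrative sketch, not from the original post: the helper names recode_p() and recode_m() are my own, and the recode value for .x == 9 was truncated in the message above, so it is omitted here and falls through to -99):

library(dplyr)

recode_p <- function(x) {
  case_when(
    x == 0 ~ 0,    x == 1 ~ 25,   x == 2 ~ 75,
    x == 3 ~ 150,  x == 4 ~ 350,  x == 5 ~ 750,
    x == 6 ~ 3000, x == 7 ~ 7500, x == 8 ~ 15000,
    TRUE ~ -99
  )
}
recode_m <- function(x) ifelse(x == 0, 0, ifelse(x == 1, 5, -99))

# df0 to the final data frame in a single step
df_final <- df0 %>%
  mutate(
    across(starts_with("P"), recode_p),
    across(c(MRELGE:MSKC, MGODRK), recode_m)
  )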
Re: [R] Is there a sexy way ...?
"Sexy code" may get a job done and demonstrate the code's knowledge of a programming language, but it often does this at the expense of clear, easy to document (i.e. annotate what the code does), easy to read, and easy to understand code. I fear that this is what this thread has developed "sexy" but not easily understandable code. While I send kudos to all of you, remember that sometimes simpler, while not as sexy can be better in the long run. ;) John David Sorkin M.D., Ph.D. Professor of Medicine, University of Maryland School of Medicine; Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center; PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center; Senior Statistician University of Maryland Center for Vascular Research; Division of Gerontology and Paliative Care, 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 Cell phone 443-418-5382 From: R-help on behalf of avi.e.gr...@gmail.com Sent: Friday, September 27, 2024 10:48 PM To: 'Rolf Turner'; r-help@r-project.org Subject: Re: [R] Is there a sexy way ...? Rold, We need to be clear on what makes an answer sexy! LOL! I decided it was sexy to do it in a way that nobody (normal) would and had not suggested yet. Here is an original version I will explain in a minute. Or, maybe best a bit before. Hee is the unformatted result whicvh is a tad hard to read but will be made readable soon: x <- list(`1` = c(7, 13, 1, 4, 10), `2` = c(2, 5, 14, 8, 11), `3` = c(6, 9, 15, 12, 3)) as.integer(unlist(strsplit(as.vector(paste(paste(x$`1`, x$`2`, x$`3`, sep=","), collapse=",")), split=","))) The result is: 7 2 6 13 5 9 1 14 15 4 8 12 10 11 3 After reading what others wrote, the following is more general one where any number of vectors in a list can be handled: as.integer(unlist(strsplit(as.vector(paste(do.call(paste, c(x, sep=",")), collapse=",")), split=","))) Perhaps a tad more readable is a version using the new pipe but for obvious reasons, the dplyr/magrittr pipe works better for me than having to create silly anonymous functions instead of using a period. You now have a pipeline: library(dplyr) x %>% c(sep=",") %>% do.call(paste, .) %>% paste(collapse=",") %>% as.vector() %>% strsplit(split=",") %>% unlist() %>% as.integer() And it returns the right answer! - You start with x and pipe it as - the first argument to c() and the second argument already in place is an option to later use comma as a separator - that is piped to a do.call() which takes that c() tuple and replaces the second argument of period with it. You now have taken the original data and made three text strings like so: "7,2,6" "13,5,9" "1,14,15" "4,8,12" "10,11,3" - But you want all those strings collapsed into a single long string with commas between the parts. Do another paste this time putting the substrings together and collapsing with a comma. The results is: "7,2,6,13,5,9,1,14,15,4,8,12,10,11,3" - But that is not a vector and don't ask why! - Now split that string at commas: "7" "2" "6" "13" "5" "9" "1" "14" "15" "4" "8" "12" "10" "11" "3" - and undo the odd list format it returns to flatten it back into a character vector: "7" "2" "6" "13" "5" "9" "1" "14" "15" "4" "8" "12" "10" "11" "3" - Yep it looks the same but is subtly different. Time to make it into integers or whatever: 7 2 6 13 5 9 1 14 15 4 8 12 10 11 3 Looked at after the fact, it seems so bloody obvious! 
And the chance of someone else trying this approach, justifiably, is low, LOL! One nice feature of the do.call() version is that it can be extended like so:

x <- list(`1` = c(7, 13, 1, 4, 10), `2` = c(2, 5, 14, 8, 11), `3` = c(6, 9, 15, 12, 3), `4` = c(101, 102, 103, 104, 105), `5` = c(-105, -104, -103, -102, -101))

Works fine and does this for the now five columns:

[1]   7   2   6 101 -105  13   5   9 102 -104   1  14  15 103 -103   4   8  12 104 -102
[21] 10  11   3 105 -101

My apologies to all who expected a more serious post. I have been focusing on Python lately and over there, some things are done differently, albeit I probably would be using the numpy and pandas packages to do this or even a simple list comprehension using zip: # Python, not R. [
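For comparison, a plain base-R alternative (my sketch, not posted in the thread): bind the vectors as rows of a matrix and let column-major as.vector() do the interleaving, with no round trip through character strings:

x <- list(`1` = c(7, 13, 1, 4, 10),
          `2` = c(2, 5, 14, 8, 11),
          `3` = c(6, 9, 15, 12, 3))
as.vector(do.call(rbind, x))
# [1]  7  2  6 13  5  9  1 14 15  4  8 12 10 11  3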
Re: [R] (no subject)
8, the values of the corresponding mean. > > > I found this solution, where db10_means is the output dataset, db10 is > my > > > initial data. > > > > > > db10_means<-db10 %>% > > >group_by(groupid) %>% > > >mutate(across(starts_with("cp"), list(mean = mean))) > > > > > > It works perfectly, except that for NA values, where it replaces to all > > > group members the NA, while in some cases, the group is made of some NA > > and > > > some values. > > > So, when I have a group of two values and one NA, I would like that for > > > those with a value, the mean is replaced, for those with NA, the NA is > > > replaced. > > > Here the mean function has not the na.rm=T option associated, but it > > > appears that this solution cannot be implemented in this case. I am not > > > even sure that this would be enough to solve my problem. > > > Thanks for any help provided. > > > > > Hello, > > > > Your data is a mess, please don't post html, this is plain text only > > list. Anyway, I managed to create a data frame by copying the data to a > > file named "rhelp.txt" and then running > > > > > > > > db10 <- scan(file = "rhelp.txt", what = character()) > > header <- db10[1:4] > > db10 <- db10[-(1:4)] |> as.numeric() > > db10 <- matrix(db10, ncol = 4L, byrow = TRUE) |> > >as.data.frame() |> > >setNames(header) > > > > str(db10) > > #> 'data.frame':25 obs. of 4 variables: > > #> $ cp1: num 1 5 3 7 10 5 2 4 8 10 ... > > #> $ cp2: num 10 2 1 4 4 5 6 4 4 15 ... > > #> $ role : num 13 5 3 6 2 8 8 7 7 3 ... > > #> $ groupid: num 4 10 7 4 7 3 7 8 8 3 ... > > > > > > And here is the data in dput format. > > > > > > > > db10 <- > >structure(list( > > cp1 = c(1, 5, 3, 7, 10, 5, 2, 4, 8, 10, 9, 2, > > 2, 20, 9, 13, 3, 4, 4, 10, 17, 8, 3, 13, 10), > > cp2 = c(10, 2, 1, 4, 4, 5, 6, 4, 4, 15, 15, 10, > > 4, 2, 11, 10, 14, 2, 4, 0, 20, 18, 4, 3, 9), > > role = c(13, 5, 3, 6, 2, 8, 8, 7, 7, 3, 10, 5, > > 11, 5, 3, 13, 12, 15, 1, 3, 15, 10, 19, 5, 2), > > groupid = c(4, 10, 7, 4, 7, 3, 7, 8, 8, 3, 2, 5, > > 20, 12, 6, 4, 6, 7, 16, 7, 3, 7, 8, 20, 6)), > > class = "data.frame", row.names = c(NA, -25L)) > > > > > > > > As for the problem, I am not sure if you want summarise instead of > > mutate but here is a summarise solution. > > > > > > > > library(dplyr) > > > > db10 %>% > >group_by(groupid) %>% > >summarise(across(starts_with("cp"), ~ mean(.x, na.rm = TRUE))) > > > > # same result, summarise's new argument .by avoids the need to group_by > > db10 %>% > >summarise(across(starts_with("cp"), ~ mean(.x, na.rm = TRUE)), .by = > > groupid) > > > > > > > > Can you post the expected output too? > > > > Hope this helps, > > > > Rui Barradas > > > > > > -- > > Este e-mail foi analisado pelo software antivírus AVG para verificar a > > presença de vírus. > > www.avg.com > > > > > -- > > Francesca > > > -- > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > https://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide https://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
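If the goal really is a mutate() that fills in the group mean for observed values while leaving the NA rows as NA, as the original poster described, one possibility is the following sketch (using the reconstructed db10 above only to show the pattern; it happens to contain no NAs):

library(dplyr)

db10_means <- db10 %>%
  group_by(groupid) %>%
  mutate(across(starts_with("cp"),
                list(mean = ~ ifelse(is.na(.x), NA, mean(.x, na.rm = TRUE))))) %>%
  ungroup()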
Re: [R] Linear regression and standard deviation at the Linux command line
Keith, I suggest you begin by looking at a web page https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/lm It will introduce you to the lm function, the function that performs linear regression, and the summary function, which returns some of the material you are looking for. The page comes complete with code that can be run via the web page. Once you review the web page, and hopefully try to run the analysis you want to run, you can again ask the R help list for additional help. There are other web pages that can help you, for example https://www.statology.org/logistic-regression-in-r/#:~:text=How%20to%20Perform%20Logistic%20Regression%20in%20R%20%28Step-by-Step%29,Predictions%20...%205%20Step%205%3A%20Model%20Diagnostics%20 Take the first steps, show that you are trying, and the R help list will be very helpful. John

John David Sorkin M.D., Ph.D. Professor of Medicine, University of Maryland School of Medicine; Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center; PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center; Senior Statistician University of Maryland Center for Vascular Research; Division of Gerontology and Palliative Care, 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 Cell phone 443-418-5382

From: R-help on behalf of Keith Christian Sent: Thursday, August 22, 2024 3:07 PM To: r-help@r-project.org Subject: [R] Linear regression and standard deviation at the Linux command line

R List, Please excuse this ultra-newbie post. I looked at this page but it's a bit beyond me. https://www2.kenyon.edu/Depts/Math/hartlaub/Math305%20Fall2011/R.htm I'm interested in R construct(s) to be entered at the command line that would output slope, y-intercept, and r-squared values read from a csv or other filename entered at the command line, and the same for standard deviation calculations, namely the standard deviation, variance, and z-scores for every data point in the file.

E.g. $ ((R function for linear regression here)) slope, y-intercept, r-squared, and other related stats that R seems most capable of generating.

linear_regression_data.csv file contents (Are line numbers, commas, etc. needed or no?)
1 20279
2 899
3 24747
4 12564
5 29543

$ ((R function for standard deviation here)) standard deviation, variance, z-scores, and other related stats that R seems most capable of generating.

standard_deviation_data.csv file contents (Are line numbers, commas, etc. needed or no?)
1 16837
2 9498
3 31389
4 2365
5 17384

Many thanks, --Keith
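A minimal sketch of how this could be run non-interactively with Rscript (my illustration; the file name, the two-column whitespace-separated layout shown above, and regressing the values on the line numbers are all assumptions):

# save as regress.R and run:  Rscript regress.R linear_regression_data.csv
args <- commandArgs(trailingOnly = TRUE)
dat  <- read.table(args[1], header = FALSE, col.names = c("n", "value"))

fit <- lm(value ~ n, data = dat)
cat("slope:", coef(fit)[2],
    " intercept:", coef(fit)[1],
    " r-squared:", summary(fit)$r.squared, "\n")

cat("sd:", sd(dat$value), " variance:", var(dat$value), "\n")
print(scale(dat$value))   # z-scores for every data point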
Re: [R] Manually calculating values from aov() result
Dear Brian, As Duncan mentioned, the terms type-I, II, and III sums of squares originated in SAS. The type-II and III SSs computed by the Anova() function in the car package take a different computational approach than in SAS, but in almost all cases produce the same results. (I slightly regret using the "type-*" terminology for car::Anova() because of the lack of exact correspondence to SAS.) The standard R anova() function computes type-I (sequential) SSs. The focus, however, shouldn't be on the SSs, or how they're computed, but on the hypotheses that are tested. Briefly, the hypotheses for type-I tests assume that all terms later in the sequence are 0 in the population; type-II tests assume that interactions to which main effects are marginal (and higher-order interactions to which lower-order interactions are marginal) are 0. Type-III tests don't, e.g., assume that interactions to which a main effect are marginal are 0 in testing the main effect, which represents an average over levels of the factor(s) with which the factor in the main effect interact. The description of the hypotheses for type-III tests is even more complex if there are covariates. In my opinion, researchers are usually interested in the hypotheses for type-II tests. These matters are described in detail, for example, in my applied regression text <https://www.john-fox.ca/AppliedRegression/index.html>. I hope this helps, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ -- On 2024-08-07 8:27 a.m., Brian Smith wrote: [You don't often get email from briansmith199...@gmail.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] Caution: External email. Hi, Thanks for this information. Is there any way to force R to use Type-1 SS? I think most textbooks use this only. Thanks and regards, On Wed, 7 Aug 2024 at 17:00, Duncan Murdoch wrote: On 2024-08-07 6:06 a.m., Brian Smith wrote: Hi, I have performed ANOVA as below dat = data.frame( 'A' = c(-0.3960025, -0.3492880, -1.5893792, -1.4579074, -4.9214873, -0.8575018, -2.5551363, -0.9366557, -1.4307489, -0.3943704), 'B' = c(2,1,2,2,1,2,2,2,2,2), 'C' = c(0,1,1,1,1,1,1,0,1,1)) summary(aov(A ~ B * C, dat)) However now I also tried to calculate SSE for factor C Mean = sapply(split(dat, dat$C), function(x) mean(x$A)) N = sapply(split(dat, dat$C), function(x) dim(x)[1]) N[1] * (Mean[1] - mean(dat$A))^2 + N[2] * (Mean[2] - mean(dat$A))^2 #1.691 But in ANOVA table the sum-square for C is reported as 0.77. Could you please help how exactly this C = 0.77 is obtained from aov() Your design isn't balanced, so there are several ways to calculate the SS for C. What you have calculated looks like the "Type I SS" in SAS notation, if I remember correctly, assuming that C enters the model before B. That's not what R uses; I think it is Type II SS. For some details about this, see https://mcfromnz.wordpress.com/2011/03/02/anova-type-ii-ss-explained/ __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
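To make the distinction concrete in code, here is a short illustration (my addition, reusing the data from the quoted message):

dat <- data.frame(
  A = c(-0.3960025, -0.3492880, -1.5893792, -1.4579074, -4.9214873,
        -0.8575018, -2.5551363, -0.9366557, -1.4307489, -0.3943704),
  B = c(2, 1, 2, 2, 1, 2, 2, 2, 2, 2),
  C = c(0, 1, 1, 1, 1, 1, 1, 0, 1, 1))

fit <- aov(A ~ B * C, data = dat)
summary(fit)                       # sequential (type-I) SS in the order B, C, B:C
anova(lm(A ~ C * B, data = dat))   # C entered first: its type-I SS matches the hand calculation
car::Anova(fit, type = 2)          # type-II SS from the car package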
[R] Create matrix with variable number of columns AND CREATE NAMES FOR THE COLUMNS
#I am trying to write code that will create a matrix with a variable number of columns where the #number of columns is 1+Grps #I can do this: NSims <- 4 Grps <- 5 DiffMeans <- matrix(nrow=NSims,ncol=1+Grps) DiffMeans #I have a problem when I try to name the columns of the matrix. I want the first column to be NSims, #and the other columns to be something like Value1, Value2, . . . Valuen where N=Grps # I wrote a function to build a list of length Grps createValuelist <- function(num_elements) { for (i in 1:num_elements) { cat("Item", i, "\n", sep = "") } } createValuelist(Grps) # When I try to assign column names I receive an error: #Error in dimnames(DiffMeans) <- list(NULL, c("NSim", createValuelist(Grps))) : # length of 'dimnames' [2] not equal to array extent dimnames(DiffMeans) <- list(NULL,c("NSim",createValuelist(Grps))) DiffMeans # Thank you for your help! John David Sorkin M.D., Ph.D. Professor of Medicine, University of Maryland School of Medicine; Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center; PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center; Senior Statistician University of Maryland Center for Vascular Research; Division of Gerontology and Paliative Care, 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 Cell phone 443-418-5382 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
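The error comes from createValuelist(): it prints with cat() and returns NULL, so it contributes nothing to the dimnames. One fix (my sketch) is to build the names as a character vector, e.g. with paste0():

NSims <- 4
Grps  <- 5
DiffMeans <- matrix(nrow = NSims, ncol = 1 + Grps)
dimnames(DiffMeans) <- list(NULL, c("NSim", paste0("Value", seq_len(Grps))))
colnames(DiffMeans)
# [1] "NSim"   "Value1" "Value2" "Value3" "Value4" "Value5"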
Re: [R] Regression performance when using summary() twice
Dear Christian, You're apparently using the glm.nb() function in the MASS package. Your function is peculiar in several respects. For example, you specify the model formula as a character string and then convert it into a formula, but you could just pass the formula to the function -- the conversion seems unnecessary. Similarly, you compute the summary for the model twice rather than just saving it in a local variable in your function. And the form of the function output is a bit strange, but I suppose you have reasons for that.

The primary reason that your function is slow, however, is that the confidence intervals computed by confint() profile the likelihood, which requires refitting the model a number of times. If you're willing to use possibly less accurate Wald-based rather than likelihood-based confidence intervals, computed, e.g., by the Confint() function in the car package, then you could speed up the computation considerably. Using a model fit by example(glm.nb),

library(MASS)
example(glm.nb)
microbenchmark::microbenchmark(
  Wald = car::Confint(quine.nb1, vcov.=vcov(quine.nb1), estimate=FALSE),
  LR = confint(quine.nb1)
)

which produces

Unit: microseconds
 expr       min        lq       mean    median       uq        max neval
 Wald   136.366    161.13   222.0872   184.541   283.72    386.466   100
   LR 87223.031  88757.09 95162.8733 95761.568 97672.23 182734.048   100

I hope this helps, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ --

On 2024-06-21 10:38 a.m., c.bu...@posteo.jp wrote:

Hello, I am not a regular R user but coming from Python. But I use R for several special tasks. Doing a regression analysis does cost some compute time. But I wonder when this big time-consuming algorithm is executed and if it is done twice in my special case. It seems that calling "glm()" or similar does not execute the time-consuming part of the regression code. It seems it is done when calling "summary(model)". Am I right so far? If this is correct, I would say that in my case the regression is done twice with the identical formula and data, which of course is inefficient. See this code:

my_function <- function(formula_string, data) {
  formula <- as.formula(formula_string)
  model <- glm.nb(formula, data = data)
  result = cbind(summary(model)$coefficients, confint(model))
  result = as.data.frame(result)
  string_result = capture.output(summary(model))
  return(list(result, string_result))
}

I do call summary() once to get the "$coefficients" and a second time when capturing its output as a string. If this really results in computing the regression twice, I ask myself if there is an R way to make this more efficient? Best regards, Christian Buhtz
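Putting the two pieces of advice together, a possible rewrite of the function (my sketch, not John Fox's code) fits the model once, computes the summary once, and uses the Wald intervals from car::Confint():

library(MASS)

my_function <- function(formula, data) {
  model <- glm.nb(formula, data = data)   # pass the formula directly, no string conversion
  smry  <- summary(model)                 # computed once, reused below
  ci    <- car::Confint(model, vcov. = vcov(model), estimate = FALSE)
  result <- as.data.frame(cbind(smry$coefficients, ci))
  string_result <- capture.output(smry)
  list(result, string_result)
}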
Re: [R] Column names of model.matrix's output with contrast.arg
Dear Christophe and Ben, Also see the car package for replacements for contr.treatment(), contr.sum(), and contr.helmert() -- e.g., help("contr.Sum", package="car"). These functions have been in the car package for more than two decades, and AFAIK, no one uses them (including myself). I didn't write a replacement for contr.poly() because the current coefficient labeling seemed reasonably transparent. Best, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ -- On 2024-06-17 4:29 p.m., Ben Bolker wrote: Caution: External email. It's sorta-kinda-obliquely-partially documented in the examples: zapsmall(cP <- contr.poly(3)) # Linear and Quadratic output: .L .Q [1,] -0.7071068 0.4082483 [2,] 0.000 -0.8164966 [3,] 0.7071068 0.4082483 FWIW the faux package provides better-named alternatives. On 2024-06-17 4:25 p.m., Christophe Dutang wrote: Thanks for your reply. It might good to document the naming convention in ?contrasts. It is hard to understand .L for linear, .Q for quadratic, .C for cubic and ^n for other degrees. For contr.sum, we could have used .Sum, .Sum… Maybe the examples ?model.matrix should use names in dd objects so that we observe when names are dropped. Kind regards, Christophe Le 14 juin 2024 à 11:45, peter dalgaard a écrit : You're at the mercy of the various contr.XXX functions. They may or may not set the colnames on the matrices that they generate. The rationales for (not) setting them is not perfectly transparent, but you obviously cannot use level names on contr.poly, so it uses .L, .Q, etc. In MASS, contr.sdif is careful about labeling the columns with the levels that are being diff'ed. For contr.treatment, there is a straightforward connection to 0/1 dummy variables, so level names there are natural. One could use levels in contr.sum and contr.helmert, but it might confuse users that comparisons are with the average of all levels or preceding levels. (It can be quite confusing when coding is +1 for male and -1 for female, so that the gender difference is twice the coefficient.) -pd On 14 Jun 2024, at 08:12 , Christophe Dutang wrote: Dear list, Changing the default contrasts used in glm() makes me aware how model.matrix() set column names. With default contrasts, model.matrix() use the level values to name the columns. However with other contrasts, model.matrix() use the level indexes. In the documentation, I don’t see anything in the documentation related to this ? It does not seem natural to have such a behavior? Any comment is welcome. An example is below. 
Kind regards, Christophe #example from ?glm counts <- c(18,17,15,20,10,20,25,13,12) outcome <- paste0("O", gl(3,1,9)) treatment <- paste0("T", gl(3,3)) X3 <- model.matrix(counts ~ outcome + treatment) X4 <- model.matrix(counts ~ outcome + treatment, contrasts = list("outcome"="contr.sum")) X5 <- model.matrix(counts ~ outcome + treatment, contrasts = list("outcome"="contr.helmert")) #check with original factor cbind.data.frame(X3, outcome) cbind.data.frame(X4, outcome) cbind.data.frame(X5, outcome) #same issue with glm glm.D93 <- glm(counts ~ outcome + treatment, family = poisson()) glm.D94 <- glm(counts ~ outcome + treatment, family = poisson(), contrasts = list("outcome"="contr.sum")) glm.D95 <- glm(counts ~ outcome + treatment, family = poisson(), contrasts = list("outcome"="contr.helmert")) coef(glm.D93) coef(glm.D94) coef(glm.D95) #check linear predictor cbind(X3 %*% coef(glm.D93), predict(glm.D93)) cbind(X4 %*% coef(glm.D94), predict(glm.D94)) - Christophe DUTANG LJK, Ensimag, Grenoble INP, UGA, France ILB research fellow Web: http://dutangc.free.fr __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dr. Benjamin Bolker Professor, Mathematics & Statistics and Biology, McMaster University Director, School of Computational Science and Engineering (Acting) Graduate chair, Mathe
[R] Can't compute row means of two columns of a dataframe.
I have a data frame with three columns, TotalInches, Low20, High20. For each row of the dataset, I am trying to compute the mean of Low20 and High20. xxxz <- structure(list(TotalInches = c(58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76), Low20 = c(84, 87, 90, 93, 96, 99, 102, 106, 109, 112, 116, 119, 122, 126, 129, 133, 137, 141, 144), High20 = c(111, 115, 119, 123, 127, 131, 135, 140, 144, 148, 153, 157, 162, 167, 171, 176, 181, 186, 191 )), class = "data.frame", row.names = c(NA, -19L)) xxxz str(xxxz) xxxz$Average20 <- by(xxxz[,c("Low20","High20")],xxxz[,"TotalInches"],mean) warnings() When I run the code above, I don't get the means by row. I get the following warning messages, one for each row of the dataframe. Warning messages: 1: In mean.default(data[x, , drop = FALSE], ...) : argument is not numeric or logical: returning NA 2: In mean.default(data[x, , drop = FALSE], ...) : argument is not numeric or logical: returning NA Can someone tell my what I am doing wrong, and how I can compute the row means? Thank you, John John David Sorkin M.D., Ph.D. Professor of Medicine, University of Maryland School of Medicine; Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center; PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center; Senior Statistician University of Maryland Center for Vascular Research; Division of Gerontology and Paliative Care, 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 Cell phone 443-418-5382 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
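For what it is worth, the row means can be computed directly with rowMeans() (my suggestion; by() splits the rows into groups, which is not what is wanted here):

xxxz$Average20 <- rowMeans(xxxz[, c("Low20", "High20")])
# or, equivalently for two columns:
# xxxz$Average20 <- (xxxz$Low20 + xxxz$High20) / 2
head(xxxz)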
Re: [R] Listing folders on One Drive
Dear Nick, See list.dirs(), which is documented in the same help file as list.files(). I hope this helps, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ -- On 2024-05-20 9:36 a.m., Nick Wray wrote: [You don't often get email from nickmw...@gmail.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] Caution: External email. Hello I have lots of folders of individual Scottish river catchments on my uni One Drive. Each folder is labelled with the river name eg "Tay" and they are all in a folder named "Scotland" I want to list the folders on One Drive so that I can cross check that I have them all against a list of folders on my laptop. Can I somehow use list.files() - I've tried various things but none seem to work... Any help appreciated Thanks Nick Wray [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
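A minimal sketch of the suggestion (the paths are hypothetical placeholders for the locally synced OneDrive folder and the laptop copy):

one_drive <- basename(list.dirs("C:/Users/me/OneDrive/Scotland", recursive = FALSE))
laptop    <- basename(list.dirs("C:/Users/me/Rivers",            recursive = FALSE))
setdiff(laptop, one_drive)   # catchment folders on the laptop but not on OneDrive
setdiff(one_drive, laptop)   # catchment folders on OneDrive but not on the laptop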
[R] Print date on y axis with month, day, and year
I am trying to use ggplot to plot the data, and R code, below. The dates (jdate) are printing as Mar 01, Mar 15, etc. I want to have the date printed as MMM DD (or any other way that will show month, date, and year, e.g. mm/dd/yy). How can I accomplish this? yyy <- structure(list( jdate = structure(c(19052, 19053, 19054, 19055, 19058, 19059, 19060, 19061, 19062, 19063, 19065, 19066, 19067, 19068, 19069, 19072, 19073, 19074, 19075, 19076, 19077, 19083, 19086, 19087, 19088, 19089, 19090, 19093, 19094, 19095), class = "Date"), Sum = c ( 1, 3, 9, 11, 13, 16, 18, 22, 26, 27, 30, 32, 35, 39, 41, 43, 48, 51, 56, 58, 59, 63, 73, 79, 81, 88, 91, 93, 96, 103)), row.names = c(NA, 30L), class = "data.frame") yyy class(yyy$jdate) ggplot(data=yyy[1:30,],aes(as.Date(jdate,format="%m-%d-%Y"),Sum)) +geom_point() Thank you John John David Sorkin M.D., Ph.D. Professor of Medicine, University of Maryland School of Medicine; Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center; PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center; Senior Statistician University of Maryland Center for Vascular Research; Division of Gerontology and Paliative Care, 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 Cell phone 443-418-5382 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
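One way to get month, day, and year in the labels (a sketch; jdate is already a Date, so no as.Date() conversion is needed inside aes()) is to format the axis with scale_x_date():

library(ggplot2)
ggplot(data = yyy, aes(jdate, Sum)) +
  geom_point() +
  scale_x_date(date_labels = "%b %d, %Y")   # e.g. "Mar 01, 2022"; use "%m/%d/%y" for 03/01/22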
Re: [R] x[0]: Can '0' be made an allowed index in R?
Hello Peter, Unless I too misunderstand your point, negative indices for removal do work with the Oarray package (though -0 doesn't work to remove the 0th element, since -0 == 0 -- perhaps what you meant): > library(Oarray) > v <- Oarray(1:10, offset=0) > v [0,] [1,] [2,] [3,] [4,] [5,] [6,] [7,] [8,] [9,] 123456789 10 > dim(v) [1] 10 > v[-1] [1] 1 3 4 5 6 7 8 9 10 > v[-0] [1] 1 Best, John On 2024-04-23 9:03 a.m., Peter Dalgaard via R-help wrote: Caution: External email. Doesn't sound like you got the point. x[-1] normally removes the first element. With 0-based indices, this cannot work. - pd On 22 Apr 2024, at 17:31 , Ebert,Timothy Aaron wrote: You could have negative indices. There are two ways to do this. 1) provide a large offset. Offset <- 30 for (i in -29 to 120) { print(df[i+Offset])} 2) use absolute values if all indices are negative. for (i in -200 to -1) {print(df[abs(i)])} Tim -Original Message- From: R-help On Behalf Of Peter Dalgaard via R-help Sent: Monday, April 22, 2024 10:36 AM To: Rolf Turner Cc: R help project ; Hans W Subject: Re: [R] x[0]: Can '0' be made an allowed index in R? [External Email] Heh. Did anyone bring up negative indices yet? -pd On 22 Apr 2024, at 10:46 , Rolf Turner wrote: See fortunes::fortune(36). cheers, Rolf Turner -- Honorary Research Fellow Department of Statistics University of Auckland Stats. Dep't. (secretaries) phone: +64-9-373-7599 ext. 89622 Home phone: +64-9-480-4619 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat/ .ethz.ch%2Fmailman%2Flistinfo%2Fr-help&data=05%7C02%7Ctebert%40ufl.edu %7C79ca6aadcaee4aa3241308dc62d986f6%7C0d4da0f84a314d76ace60a62331e1b84 %7C0%7C0%7C638493933686698527%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAw MDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata= wmv9OYcMES0nElT9OAKTdjBk%2BB55bQ7BjxOuaVVkPg4%3D&reserved=0 PLEASE do read the posting guide http://www.r/ -project.org%2Fposting-guide.html&data=05%7C02%7Ctebert%40ufl.edu%7C79 ca6aadcaee4aa3241308dc62d986f6%7C0d4da0f84a314d76ace60a62331e1b84%7C0% 7C0%7C638493933686711061%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiL CJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=AP78X nfKrX6B0YVM0N76ty9v%2Fw%2BchHIytw33X7M9umE%3D&reserved=0 and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] Question regarding reservoir volume and water level
Aside from the fact that the original question might well be a class exercise (or homework), the question is unanswerable given the data given by the original poster. One needs to know the dimensions of the reservoir, above and below the current waterline. Are the sides, above and below the waterline smooth? Is the region currently above the waterline that can store water a mirror image of the region below the waterline? Is the region above the reservoir include a flood plane? Will the additional water go into the flood plane? The lack of required detail in the question posed by the original poster suggests that there are strong assumptions, assumptions that typically would be made in a class-room example or exercise. John John David Sorkin M.D., Ph.D. Professor of Medicine, University of Maryland School of Medicine; Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center; PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center; Senior Statistician University of Maryland Center for Vascular Research; Division of Gerontology and Paliative Care, 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 Cell phone 443-418-5382 From: R-help on behalf of Rui Barradas Sent: Sunday, April 7, 2024 10:53 AM To: javad bayat; R-help Subject: Re: [R] Question regarding reservoir volume and water level Às 13:27 de 07/04/2024, javad bayat escreveu: > Dear all; > I have a question about the water level of a reservoir, when the volume > changed or doubled. > There is a DEM file with the highest elevation 1267 m. The lowest elevation > is 1230 m. The current volume of the reservoir is 7,000,000 m3 at 1240 m. > Now I want to know what would be the water level if the volume rises to > 1250 m? or what would be the water level if the volume doubled (14,000,000 > m3)? > > Is there any way to write codes to do this in R? > I would be more than happy if anyone could help me. > Sincerely > > > > > > > > Hello, This is a simple rule of three. If you know the level l the argument doesn't need to be named but if you know the volume v then it must be named. water_level <- function(l, v, level = 1240, volume = 7e6) { if(missing(v)) { volume * l / level } else level * v / volume } lev <- 1250 vol <- 14e6 water_level(l = lev) #> [1] 7056452 water_level(v = vol) #> [1] 2480 Hope this helps, Rui Barradas -- Este e-mail foi analisado pelo software antivírus AVG para verificar a presença de vírus. http://www.avg.com/ __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Rtools and things dependent on it
David, I greatly appreciate the explanation you gave regarding R tools providing tools available in Linux distros, but not found in Windows. (I am using a windows system). Does this mean that Linux users don't need to use R tools when they want to compile R code? Additionally, thank you for the information about what I should read. I will look at the material again, and hopefully things the material you suggest I read will be more understandable. John P.S. This email should be in txt format, not html. I sent if from my desktop windows machine which provides more options than does my iPhone. John David Sorkin M.D., Ph.D. Professor of Medicine, University of Maryland School of Medicine; Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center; PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center; Senior Statistician University of Maryland Center for Vascular Research; Division of Gerontology and Paliative Care, 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 Cell phone 443-418-5382 From: David Winsemius Sent: Friday, February 23, 2024 8:14 PM To: Sorkin, John Cc: avi.e.gr...@gmail.com; r-help@r-project.org Subject: Re: [R] Rtools and things dependent on it On 2/23/24 16:28, Sorkin, John wrote: David, My apologies regarding the format of my email. I am replying using my iPhone, and I can’t find a way to switch from what I suspect is html to txt format. The link you sent told me that R tools allows compilation of code. It's specifically designed to provide the code tools missing in Windows that would other wise have been provided by a typical Linux distro. More expansively, it allows compilation of code written in C and/or Fortran using the version that was used to build the matching R version and allows it to be called by the routines written in R that bind a package together. This is good to know, but beyond this important fact, the rest of the material was close to unintelligible. The phrase "the rest of the material" is not specific enough to offer more explanation. You should quote material that is beyond your understanding. You should only be reading the sections named: "Installing Rtools43" and "Building packages from source using Rtools43". I doubt that material further on would be relevant. -- David I doubt this is the fault of the author, it is probably because I lack some basic knowledge. Can you suggest some more basic material I can read. Please note. I am not computer naive, I am simply missing basic knowledge of the material discussed in the web page. Thank you, John John David Sorkin M.D., Ph.D. Professor of Medicine Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) On Feb 23, 2024, at 7:01 PM, David Winsemius <mailto:dwinsem...@comcast.net> wrote: On 2/23/24 14:34, avi.e.gr...@gmail.com<mailto:avi.e.gr...@gmail.com> wrote: This may be a dumb question and the answer may make me feel dumber. I have had trouble for years with R packages wanting Rtools on my machine and not being able to use it. Many packages are fine as binaries are available. I have loaded Rtools and probably need to change my PATH or something. 
I suppose making sure that whatever directory holds your Rtools code is on your path would be a good idea. I wondered if there's an environment variable that could be set, but reading the page on using Rtools did not mention one until I got down to the section on building R from source which is surely NOT what you want to do.. You should read the information on installation and building packages from source. https://cran.r-project.org/bin/windows/base/howto-R-devel.html<https://cran.r-project.org/bin/windows/base/howto-R-devel.html> which includes this sentence: "It is recommended to use the defaults and install into|c:/rtools43|. When done that way, Rtools43 may be used in the same R session which installed it or which was started before Rtools43 was installed." But I recently suggested to someone that they might want to use the tabyl() function in the janitor package that I find helpful. I get a warning when I install it about Rtools but it works fine. When they install it, it fails. I assumed they would get it from CRAN the same way I did as we are both using Windows and from within RSTUDIO. In the past, I have run into other packages I could not use and just moved on but it seems like time to see if this global problem has a work-around. And, in particular, I have the latest versions of
Re: [R] Rtools and things dependent on it
Avi , Your question is not dumb. Let me ask a more fundamental question. What is R tools, what does it do, and how is it used. From time to time, I receive a message when I down load a package saying I need R tools. When I receive the message, I don’t know what I should do, other than down load R tools. John John David Sorkin M.D., Ph.D. Professor of Medicine Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) On Feb 23, 2024, at 5:34 PM, avi.e.gr...@gmail.com wrote: This may be a dumb question and the answer may make me feel dumber. I have had trouble for years with R packages wanting Rtools on my machine and not being able to use it. Many packages are fine as binaries are available. I have loaded Rtools and probably need to change my PATH or something. But I recently suggested to someone that they might want to use the tabyl() function in the janitor package that I find helpful. I get a warning when I install it about Rtools but it works fine. When they install it, it fails. I assumed they would get it from CRAN the same way I did as we are both using Windows and from within RSTUDIO. In the past, I have run into other packages I could not use and just moved on but it seems like time to see if this global problem has a work-around. And, in particular, I have the latest versions of both R and RSTUDIO which can be a problem when other things are not as up-to-date. Or, maybe some people with R packages could be convinced to make binaries available in the first place? Avi [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-help&data=05%7C02%7CJSorkin%40som.umaryland.edu%7C8d5f2c8346f24559a7f908dc34bf9979%7C717009a620de461a88940312a395cac9%7C0%7C0%7C638443244987424663%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C6%7C%7C%7C&sdata=BO9wgkrjNmI4j2deiBDxHw%2F9tVjynfQYEHhBZ8BGq%2Fk%3D&reserved=0 PLEASE do read the posting guide https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.r-project.org%2Fposting-guide.html&data=05%7C02%7CJSorkin%40som.umaryland.edu%7C8d5f2c8346f24559a7f908dc34bf9979%7C717009a620de461a88940312a395cac9%7C0%7C0%7C638443244987432863%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C6%7C%7C%7C&sdata=kVnTbE6ZEpmJ88Zmu%2FUbUH%2F%2FnjoSHSmDjuIxxxw3uz8%3D&reserved=0 and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
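For anyone wanting to check whether the Rtools toolchain is actually usable from R after installation, a quick sanity check (my addition; pkgbuild is an extra package, not mentioned in the thread):

Sys.which("make")                        # non-empty if the Rtools utilities are on the PATH
# install.packages("pkgbuild")
pkgbuild::has_build_tools(debug = TRUE)  # TRUE if R can find a working compiler toolchain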
Re: [R] ggarrange & legend
Blast it hit send by accident. Anyway the code above is a WWE. I don't see any obvious way no move the legend On Mon, 5 Feb 2024 at 09:13, John Kane wrote: > I'm sorry but that is not a working example. > > A working example needs to create the plots being used. > > For example, stealing some code from > https://rpkgs.datanovia.com/ggpubr/reference/ggarrange.html > #= > > data <https://rdrr.io/r/utils/data.html>("ToothGrowth")df <- > ToothGrowthdf$dose <- as.factor > <https://rdrr.io/r/base/factor.html>(df$dose)# Box plotbxp <- ggboxplot > <https://rpkgs.datanovia.com/ggpubr/reference/ggboxplot.html>(df, x = "dose", > y = "len",color = "dose", palette = "jco")# Density plotdens <- ggdensity > <https://rpkgs.datanovia.com/ggpubr/reference/ggdensity.html>(df, x = "len", > fill = "dose", palette = "jco") > > mylist<-list(bxp, dens) > > dev.new(width=28, height=18) > > fig1<- ggarrange(plotlist=mylist, common.legend = TRUE, legend="top", labels > = c("(A)", "(B)"), font.label = list(size = 18, color = "black"), ncol=2) > > fig1 > #= > > > On Mon, 5 Feb 2024 at 08:44, wrote: > >> Dear John Kane >> >> Dear R community >> >> >> >> Here my working example >> >>1. Example that is working with legend=”top”. However, as mentioned, >>the legend is in the middle of the top axis. >> >> mylist<-list(p1, p2) >> >> dev.new(width=28, height=18) >> >> fig1<- ggarrange(plotlist=mylist, common.legend = TRUE, legend="top", >> labels = c("(A)", "(B)"), font.label = list(size = 18, color = "black"), >> ncol=2) >> >> fig1 >> >> >> >>1. My question is how I can position the legend on the topright of >>the top axis. However, “topright” is not a common label for legend in >>ggarrange (but in other plot functions), so legend =”topright” is not >>working. >> >> mylist<-list(p1, p2) >> >> dev.new(width=28, height=18) >> >> fig1<- ggarrange(plotlist=mylist, common.legend = TRUE, >> legend="topright", labels = c("(A)", "(B)"), font.label = list(size = 18, >> color = "black"), ncol=2) >> >> fig1 >> >> >> >> Kind regards >> >> Sibylle >> >> >> >> *From:* John Kane >> *Sent:* Monday, February 5, 2024 1:59 PM >> *To:* sibylle.stoec...@gmx.ch >> *Cc:* r-help@r-project.org >> *Subject:* Re: [R] ggarrange & legend >> >> >> >> Could you supply us with a MWE (minimal working example)of what you have >> so far? >> >> Thanks. >> >> >> >> On Mon, 5 Feb 2024 at 05:00, SIBYLLE STÖCKLI via R-help < >> r-help@r-project.org> wrote: >> >> Dear R community >> >> It is possible to adjust the legend in combined ggplots using ggarrange >> with >> be positions top, bottom, left and right. >> My question: Is there a function to change the position of the legend to >> topright or bottomleft? Right and top etc are in the middle of the axis. >> >> Kind regards >> Sibylle >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> >> >> -- >> >> John Kane >> Kingston ON Canada >> > > > -- > John Kane > Kingston ON Canada > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
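One possible workaround (a sketch building on the ToothGrowth example quoted above, not a documented ggarrange feature): skip common.legend and instead keep a single legend inside one panel, anchored to its top-right corner with ggplot2's theme().

library(ggpubr)
library(ggplot2)

data("ToothGrowth")
df <- ToothGrowth
df$dose <- as.factor(df$dose)

bxp <- ggboxplot(df, x = "dose", y = "len", color = "dose", palette = "jco") +
  theme(legend.position = c(1, 1),        # top-right corner of panel (A)
        legend.justification = c(1, 1))   # recent ggplot2 may suggest legend.position.inside instead
dens <- ggdensity(df, x = "len", fill = "dose", palette = "jco") +
  theme(legend.position = "none")

ggarrange(bxp, dens, labels = c("(A)", "(B)"), ncol = 2)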
Re: [R] ggarrange & legend
I'm sorry but that is not a working example. A working example needs to create the plots being used. For example, stealing some code from https://rpkgs.datanovia.com/ggpubr/reference/ggarrange.html #= data <https://rdrr.io/r/utils/data.html>("ToothGrowth")df <- ToothGrowthdf$dose <- as.factor <https://rdrr.io/r/base/factor.html>(df$dose)# Box plotbxp <- ggboxplot <https://rpkgs.datanovia.com/ggpubr/reference/ggboxplot.html>(df, x = "dose", y = "len",color = "dose", palette = "jco")# Density plotdens <- ggdensity <https://rpkgs.datanovia.com/ggpubr/reference/ggdensity.html>(df, x = "len", fill = "dose", palette = "jco") mylist<-list(bxp, dens) dev.new(width=28, height=18) fig1<- ggarrange(plotlist=mylist, common.legend = TRUE, legend="top", labels = c("(A)", "(B)"), font.label = list(size = 18, color = "black"), ncol=2) fig1 #= On Mon, 5 Feb 2024 at 08:44, wrote: > Dear John Kane > > Dear R community > > > > Here my working example > >1. Example that is working with legend=”top”. However, as mentioned, >the legend is in the middle of the top axis. > > mylist<-list(p1, p2) > > dev.new(width=28, height=18) > > fig1<- ggarrange(plotlist=mylist, common.legend = TRUE, legend="top", > labels = c("(A)", "(B)"), font.label = list(size = 18, color = "black"), > ncol=2) > > fig1 > > > >1. My question is how I can position the legend on the topright of the >top axis. However, “topright” is not a common label for legend in ggarrange >(but in other plot functions), so legend =”topright” is not working. > > mylist<-list(p1, p2) > > dev.new(width=28, height=18) > > fig1<- ggarrange(plotlist=mylist, common.legend = TRUE, legend="topright", > labels = c("(A)", "(B)"), font.label = list(size = 18, color = "black"), > ncol=2) > > fig1 > > > > Kind regards > > Sibylle > > > > *From:* John Kane > *Sent:* Monday, February 5, 2024 1:59 PM > *To:* sibylle.stoec...@gmx.ch > *Cc:* r-help@r-project.org > *Subject:* Re: [R] ggarrange & legend > > > > Could you supply us with a MWE (minimal working example)of what you have > so far? > > Thanks. > > > > On Mon, 5 Feb 2024 at 05:00, SIBYLLE STÖCKLI via R-help < > r-help@r-project.org> wrote: > > Dear R community > > It is possible to adjust the legend in combined ggplots using ggarrange > with > be positions top, bottom, left and right. > My question: Is there a function to change the position of the legend to > topright or bottomleft? Right and top etc are in the middle of the axis. > > Kind regards > Sibylle > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > > > -- > > John Kane > Kingston ON Canada > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ggarrange & legend
Could you supply us with a MWE (minimal working example)of what you have so far? Thanks. On Mon, 5 Feb 2024 at 05:00, SIBYLLE STÖCKLI via R-help < r-help@r-project.org> wrote: > Dear R community > > It is possible to adjust the legend in combined ggplots using ggarrange > with > be positions top, bottom, left and right. > My question: Is there a function to change the position of the legend to > topright or bottomleft? Right and top etc are in the middle of the axis. > > Kind regards > Sibylle > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] DrDimont package in R
Nothing got through. Try plain text rather than HTML. On Mon, 5 Feb 2024 at 06:04, Anas Jamshed wrote: > > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Use of geometric mean .. in good data analysis
I've advised people consulting me that if their data is loaded with zeros, while they are absolutely certain that something should be where the zeros are, then they either need a better measuring tool, or to carefully document the results of limits on detectability and then note what fraction of the data is really below instrument limits. It's important information as it stands, but they don't want to go writing fairy tales based on things not seen. On 1/22/24 12:57, Jeff Newmiller via R-help wrote: > Still OT... but here is my own (I think previously mentioned here) rant on > people thrashing about with log transformation and an all-too-common kludge > to deal with zeros mixed among small > numbers...https://gist.github.com/jdnewmil/99301a88de702ad2fcbaef33326b08b4 > > OP perhaps posting a link here to your question posed wherever you end up > with it will help shorten this thread. > > On January 22, 2024 12:23:20 PM PST, Bert Gunter > wrote: >> Ah LOD's, typically LLOD's ("lower limits of detection"). >> >> Disclaimer: I am *NOT* in any sense an expert on such matters. What follows >> are just some comments based on my personal experience. Please filter >> accordingly. Also, while I kept it on list as Martin suggested it might be >> useful to do so, most folks probably can safely ignore the rant that >> follows as off topic and not of interest. So you've been warned!! >> >> The rant: >> My experience is: data that contain a "bunch" of values that are, e.g. >> below a LLOD, are frequently reported and/or analyzed by various ad hoc, >> and imho, uniformly bad methods. e.g.: >> >> 1) The censored values are recorded and analyzed as at the LLOD; >> 2) The censored values are recorded and analyzed at some arbitrary value >> below the LLOD, like LLOD/2; >> 3) The censored values are are "imputed" by ad hoc methods, e.g. uniform >> random values between 0 and the LLOD for left censoring. >> >> To repeat, *IMO*, all of this is junk and will produced misleading >> statistical results. Whether they mislead enough to substantively affect >> the science or regulatory decisions depend on the specifics of the >> circumstances. I accept no general claim as to their innocuousness. >> >> Further: >> >> a) When you have a "lot" of values -- 50%? 75%?, 25%? -- face facts: you >> have (practically) no useful information from the values that you do have >> to infer what the distribution of values that you don't have looks like. >> All one can sensibly do is say that x% of the values are below a LOD and >> here's the distribution of what lies above. Presumably, if you have such >> data conditional on covariates with the obvious intent to determine the >> relationship to those covariates, you could analyze the percentages of >> LLOD's and known values separately. There are undoubtedly more >> sophisticated methods out there, so this is where you need to go to the >> literature to see what might suit; though I think it will still have to >> come down to looking at these separately (e.g. with extra parameters to >> account for unmeasurable values). Another way of saying this is: any >> analysis which treats all the data as arising from a single distribution >> will depend more on the assumptions you make than on the data. So good luck >> with that! >> >> b) If you have a "modest" amount of (known) censoring -- 5%?, 20%? 10%? -- >> methods for the analysis of censored data should be useful. 
My >> understanding is that MI (multiple imputation) is regarded as a generally >> useful approach, and there are many R packages that can do various flavors >> of this. Again, you should consult the literature: there are very likely >> nontechnical reviews of this topic, too, as well as online discussions and >> tutorials. >> >> So if you are serious about dealing with this and have a lot of data with >> these issues, my advice would be to stop looking for ad hoc advice and dig >> into the literature: it's one of the many areas of "data science" where >> seemingly simple but pervasive questions require complex answers. >> >> And, again, heed my personal caveats. >> >> Thus endeth my rant. >> >> Cheers to all, >> Bert >> >> >> >> On Mon, Jan 22, 2024 at 9:29 AM Rich Shepard >> wrote: >> >>> On Mon, 22 Jan 2024, Martin Maechler wrote: >>> >>>> I think it is a good question, not really only about geo-chemistry, but >>>> about statistics in applied sci
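[Editor's note] A small simulated sketch of the censored-data approach discussed above, i.e. treating values below the LLOD as left-censored rather than replacing them with ad hoc numbers. The lognormal model, the LLOD value, and the use of survival::survreg() are assumptions chosen for illustration.

library(survival)
set.seed(1)
y    <- rlnorm(200, meanlog = 0, sdlog = 1)   # "true" concentrations
llod <- 0.5
obs      <- pmax(y, llod)                     # reported value: the LLOD if below it
detected <- as.numeric(y >= llod)             # 1 = measured, 0 = below the LLOD (left-censored)
mean(detected == 0)                           # fraction of the data that is censored
fit <- survreg(Surv(obs, detected, type = "left") ~ 1, dist = "lognormal")
summary(fit)   # intercept estimates meanlog, Log(scale) estimates log(sdlog), despite censoring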
Re: [R] Use of geometric mean .. in good data analysis
Dear Martin, Helpful general advice, although it's perhaps worth mentioning that the geometric mean, defined e.g. naively as prod(x)^(1/length(x)), is necessarily 0 if there are any 0 values in x. That is, the geometric mean "works" in this case but isn't really informative. Best, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ On 2024-01-22 12:18 p.m., Martin Maechler wrote: Caution: External email. Rich Shepard on Mon, 22 Jan 2024 07:45:31 -0800 (PST) writes: > A statistical question, not specific to R. I'm asking for > a pointer for a source of definitive descriptions of what > types of data are best summarized by the arithmetic, > geometric, and harmonic means. In spite of off-topic: I think it is a good question, not really only about geo-chemistry, but about statistics in applied sciences (and engineering for that matter). Something I sure good applied statisticians in the 1980's and 1990's would all know the answer of : To use the geometric mean instead of the arithmetic mean is basically *equivalent* to first log-transform the data and then work with that transformed data: Not just for computing average, but for more relevant modelling, inference, etc. John W Tukey (and several other of the grands of the time) had the log transform among the "First aid transformations": If the data for a continuous variable must all be positive it is also typically the case that the distribution is considerably skewed to the right. In such a case behave as a good human who sees another human in health distress: apply First Aid -- do the things you learned to do quickly without too much thought, because things must happen fast ---to hopefully save the other's life. Here: Do log transform all such variables with further ado, and only afterwards start your (exploratory and more) data analysis. Now, mean(log(y)) = log(geometricmean(y)), where mean() is the arithmetic mean as in R {mathematically; on the computer you need all.equal(), not '==' !!} I.e., according to Tukey and all the other experienced applied statisticians of the past, the geometric mean is the "best thing" to do for such positive right-skewed data in the same sense that the log-transform is the best "a priori" transformation for such data -- with the one advantage even that you need to fiddle with zeroes when log-transforming, whereas the geometric mean works already for zeroes. Martin > As an aquatic ecologist I see regulators apply the > geometric mean to geochemical concentrations rather than > using the arithmetic mean. I want to know whether the > geometric mean of a set of chemical concentrations (e.g., > in mg/L) is an appropriate representation of the expected > value. If not, I want to explain this to non-technical > decision-makers; if so, I want to understand why my > assumption is wrong. > TIA, > Rich > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and > more, see https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html and provide > commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
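[Editor's note] A minimal illustration of the two points above: the geometric mean is the arithmetic mean on the log scale (compare with all.equal(), not ==), and a single zero drives the naive geometric mean to 0 while making the log-scale mean -Inf.

geomean <- function(x) exp(mean(log(x)))
x <- c(2, 8)
geomean(x)                                   # 4
all.equal(mean(log(x)), log(geomean(x)))     # TRUE
x0 <- c(2, 8, 0)
prod(x0)^(1/length(x0))                      # 0: one zero dominates the product
mean(log(x0))                                # -Inf on the log scale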
Re: [R] Is there any design based two proportions z test?
Dear Md Kamruzzaman, I've copied this response to the r-help list, where you originally asked your question. That way, other people can follow the conversation, if they're interested and there will be a record of the solution. Please keep r-help in the loop See below: On 2024-01-17 9:47 p.m., Md. Kamruzzaman wrote: Caution: External email. Dear John Thank you so much for your reply. I have calculated the 95%CI of the separate two proportions by using the survey package. The code is given below. svyby(~Diabetes_Cate, ~Year, nhc, svymean, na=TRUE) Here: nhc is the weighted survey data. I understand your point that it is possible to calculate the 95%CI of the proportional difference manually. It is time consuming, that's why I was looking for a function with a design effect to calculate this easily. I couldn't find this kind of function. However, it will be okay for me to calculate this manually, if there are no functions like this. If you intend to do this computation once, it's not terribly time consuming. If you intend to do it repeatedly, you can write a simple function to do the calculation, probably in less time than it takes to search for one. For manual calculation, could you please share the formula? to calculate the 95%CI of proportional difference. Here's a simple function to compute the confidence interval, assuming that the normal distribution is used. The formula is based on the elementary result that the variance of the difference of two independent random variables is the sum of their variances, plus the observation that the width of the confidence interval is 2*z*SE, where z is the normal quantile corresponding to the confidence level (e.g., 1.96 for a 95% CI). ciDiff <- function(ci1, ci2, level=0.95){ p1 <- mean(ci1) p2 <- mean(ci2) z <- qnorm((1 - level)/2, lower.tail=FALSE) se1 <- (ci1[2] - ci1[1])/(2*z) se2 <- (ci2[2] - ci2[1])/(2*z) seDiff <- sqrt(se1^2 + se2^2) (p1 - p2) + c(-z, z)*seDiff } Example: Prevalence of Diabetes: 2011: 11.0 (95%CI 10.1-11.9) 2017: 10.1 (95%CI 9.4-10.9) Diff: 0.9% (95%CI: ??) These are percentages, not proportions, but you can use either: > ciDiff(c(10.1, 11.9), c(9.4, 10.9)) [1] -0.3215375 2.0215375 > ciDiff(c(.101, .119), c(.094, .109)) [1] -0.003215375 0.020215375 You'll want more significant digits in the inputs to get sufficiently precise results. Since I did this quickly, if I were you I'd check the results manually. Best, John With Kind Regards ----- */Md Kamruzzaman/* On Thu, Jan 18, 2024 at 12:44 AM John Fox <mailto:j...@mcmaster.ca>> wrote: Dear Md Kamruzzaman, To answer your second question first, you could just use the svychisq() function. The difference-of-proportion test is equivalent to a chisquare test for the 2-by-2 table. You don't say how you computed the confidence intervals for the two separate proportions, but if you have their standard errors (and if not, you should be able to infer them from the confidence intervals) you can compute the variance of the difference as the sum of the variances (squared standard errors), because the two proportions are independent, and from that the confidence interval for their difference. I hope this helps, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ <https://www.john-fox.ca/> On 2024-01-16 10:21 p.m., Md. Kamruzzaman wrote: > [You don't often get email from mkzama...@gmail.com <mailto:mkzama...@gmail.com>. 
Learn why this is important at https://aka.ms/LearnAboutSenderIdentification <https://aka.ms/LearnAboutSenderIdentification> ] > > Caution: External email. > > > Hello Everyone, > I was analysing big survey data using survey packages on RStudio. Survey > package allows survey data analysis with the design effect.The survey > package included functions for all other statistical analysis except > two-proportion z tests. > > I was trying to calculate the difference in prevalence of Diabetes and > Prediabetes between the year 2011 and 2017 (with 95%CI). I was able to > calculate the weighted prevalence of diabetes and prediabetes in the Year > 2011 and 2017 and just subtracted the prevalence of 2011 from the > prevalence of 2017 to get the difference in prevalence. But I could not > calculate the 95%CI of the difference in prevalence considering the weight > of the survey data. > >
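[Editor's note] A hedged companion to John Fox's ciDiff() above, under the same assumptions (independent estimates, normal-theory intervals): it also recovers the z statistic and a two-sided p-value for the difference, which was the other unanswered part of the original question.

zTestDiff <- function(ci1, ci2, level = 0.95) {
  z   <- qnorm((1 - level)/2, lower.tail = FALSE)
  p1  <- mean(ci1); p2 <- mean(ci2)           # point estimates recovered as CI midpoints
  se1 <- (ci1[2] - ci1[1])/(2*z)
  se2 <- (ci2[2] - ci2[1])/(2*z)
  zstat <- (p1 - p2)/sqrt(se1^2 + se2^2)
  c(diff = p1 - p2, z = zstat, p.value = 2*pnorm(-abs(zstat)))
}
zTestDiff(c(10.1, 11.9), c(9.4, 10.9))   # diff 0.85, z about 1.42, two-sided p about 0.16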
Re: [R] Is there any design based two proportions z test?
Dear Md Kamruzzaman, To answer your second question first, you could just use the svychisq() function. The difference-of-proportion test is equivalent to a chisquare test for the 2-by-2 table. You don't say how you computed the confidence intervals for the two separate proportions, but if you have their standard errors (and if not, you should be able to infer them from the confidence intervals) you can compute the variance of the difference as the sum of the variances (squared standard errors), because the two proportions are independent, and from that the confidence interval for their difference. I hope this helps, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ On 2024-01-16 10:21 p.m., Md. Kamruzzaman wrote: [You don't often get email from mkzama...@gmail.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] Caution: External email. Hello Everyone, I was analysing big survey data using survey packages on RStudio. Survey package allows survey data analysis with the design effect.The survey package included functions for all other statistical analysis except two-proportion z tests. I was trying to calculate the difference in prevalence of Diabetes and Prediabetes between the year 2011 and 2017 (with 95%CI). I was able to calculate the weighted prevalence of diabetes and prediabetes in the Year 2011 and 2017 and just subtracted the prevalence of 2011 from the prevalence of 2017 to get the difference in prevalence. But I could not calculate the 95%CI of the difference in prevalence considering the weight of the survey data. I was also trying to see if this difference in prevalence is statistically significant. I could do it using the simple two-proportion z test without considering the weight of the sample. But I want to do it considering the weight of the sample. Example: Prevalence of Diabetes: 2011: 11.0 (95%CI 10.1-11.9) 2017: 10.1 (95%CI 9.4-10.9) Diff: 0.9% (95%CI: ??) Proportion Z test P Value: ?? Your cooperation will be highly appreciated. Thanks in advance. With Regards ** *Md Kamruzzaman* *PhD **Research Fellow (**Medicine**)* Discipline of Medicine and Centre of Research Excellence in Translating Nutritional Science to Good Health Adelaide Medical School | Faculty of Health and Medical Sciences The University of Adelaide Adelaide SA 5005 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
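[Editor's note] A hedged sketch of the svychisq() route described above, reusing the design object and variable names from the poster's earlier code (nhc, Diabetes_Cate and Year are assumptions taken from that code, not objects shown in full here).

library(survey)
svyby(~ Diabetes_Cate, ~ Year, nhc, svymean, na.rm = TRUE)  # design-based prevalences with SEs
svychisq(~ Diabetes_Cate + Year, design = nhc)              # design-based test of the year difference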
Re: [R] arrow on contour line
Something like this should work. You will need to fiddle around with the actual co-ordinates etc. I just stuck an arrow in what seemed like a handy place.

On Wed, 10 Jan 2024 at 19:13, Deepankar Basu wrote:

> Hello,
>
> I am drawing contour lines for a function of 2 variables at one level of
> the value of the function and want to include a small arrow in any
> direction of increase of the function. Is there some way to do that?
>
> Below is an example that creates the contour lines. How do I add one small
> arrow on each line in the direction of increase of the function (at some
> central point of the contour line)? Any direction will do, but perhaps the
> direction of the gradient will be the best.
>
> Thanks in advance.
> DB
>
> library(tidyverse)
>
> x <- seq(1,2,length.out=100)
> y <- seq(1,2,length.out=100)
>
> myf <- function(x,y) {x*y}
> myg <- function(x,y) {x^2 + y^2}
>
> d1 <- expand.grid(X1 = x, X2 = y) %>%
>   mutate(Z = myf(X1,X2)) %>%
>   as.data.frame()
>
> d2 <- expand.grid(X1 = x, X2 = y) %>%
>   mutate(Z = myg(X1,X2)) %>%
>   as.data.frame()
>
> ggplot(data = d1, aes(x=X1,y=X2,z=Z))+
>   stat_contour(breaks = c(2)) +
>   stat_contour(data=d2, aes(x=X1,y=X2,z=Z), breaks=c(6))
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
John Kane
Kingston ON Canada
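[Editor's note] A hedged sketch of the "just stick an arrow on it" approach, reusing d1 and d2 from the quoted code. The coordinates are illustrative only: the start point sits near the f = 2 contour and the arrow runs roughly along the gradient of f(x, y) = x*y, which is (y, x).

library(ggplot2)
library(grid)   # for arrow() and unit()
ggplot(data = d1, aes(x = X1, y = X2, z = Z)) +
  stat_contour(breaks = c(2)) +
  stat_contour(data = d2, aes(x = X1, y = X2, z = Z), breaks = c(6)) +
  annotate("segment", x = 1.41, y = 1.42, xend = 1.48, yend = 1.49,
           arrow = arrow(length = unit(0.15, "cm")))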
Re: [R] Truncated plots
If it looks to be a very specific RStudio/ggplot2 problem then https://community.rstudio.com is probably the place to ask. What happens if she does as Duncan suggests or if she exports the file? Come to think of it, is she getting the same result if she clicks on Zoom in the plot window? The standard plot window in RStudio can distort the actual image due to size restrictions. On Tue, 9 Jan 2024 at 12:47, Ivan Krylov via R-help wrote: > В Tue, 9 Jan 2024 16:42:32 + > Nick Wray пишет: > > > she has a problem with R studio on her laptop > > Does the problem happen with plain R, without Rstudio? > > What's the student's sessionInfo()? > > > I have a screenshot which could email if anyone needs to see what it > > looks like. > > I think that PNG screenshots are allowed on the mailing list, so it > could be very helpful if you attached an appropriately cropped > screenshot. > > -- > Best regards, > Ivan > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
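[Editor's note] A hedged sketch of the "export the file" check suggested above: save the plot straight to disk with ggsave(). If the saved PNG is complete, the truncation is in the RStudio plot pane (size/zoom), not in ggplot2 or the student's code. The plot here is a stand-in, since hers was not posted.

library(ggplot2)
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()   # stand-in for the student's plot
ggsave("check.png", p, width = 16, height = 10, units = "cm", dpi = 300)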
[R] Amelia. Imputation of time-series data
Colleagues, I have started working with Amelia, with the aim of imputing missing data for time-series data. Although I have succeeded in getting Amelia to perform the imputation, I have not found any documentation describing how Amelia imputes time-series data. I have read the basic Amelia documentation, but it does not address how time-series data are imputed. The documentation describes general imputation where there is no serial auto correlation of repeated observations from the same subject. Does Amelia incorporate the serial autocorrelation in the imputation procedure? Can someone direct me to documentation that explains the imputation method? Thank you, John John David Sorkin M.D., Ph.D. Professor of Medicine, University of Maryland School of Medicine; Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center; PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center; Senior Statistician University of Maryland Center for Vascular Research; Division of Gerontology and Paliative Care, 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 Cell phone 443-418-5382 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
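[Editor's note] A hedged sketch of the time-series arguments that amelia() does document (see the Amelia II vignette by Honaker, King and Blackwell): serial structure enters the imputation model only through terms such as these, not through an explicit autocorrelation parameter. The variable names (year, subject, y) and the data frame mydata are placeholders, not from the original post.

library(Amelia)
a.out <- amelia(mydata, m = 5,
                ts = "year", cs = "subject",   # time index and cross-section (subject) id
                polytime = 2, intercs = TRUE,  # per-subject quadratic time trends
                lags = "y", leads = "y")       # lagged and lead values of y as predictors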
[R] Obtaining a value of pie in a zero inflated model (fm-zinb2)
I am running a zero inflated regression using the zeroinfl function similar to the model below: fm_zinb2 <- zeroinfl(art ~ . | ., data = bioChemists, dist = "poisson") summary(fm_zinb2) I have three questions: 1) How can I obtain a value for the parameter pie, which is the fraction of the population that is in the zero inflated model vs the fraction in the count model? 2) For any particular subject, how can I determine if the subject is in the portion of the population that contributes a zero count because the subject is in the group of subjects who have structural zero responses vs. the subject being in the portion of the population who can contribute a zero or a non-zero response? 3) zero inflated models can be solved using closed form solutions, or using iterative methods. Which method is used by fm_zinb2? Thank you, John John David Sorkin M.D., Ph.D. Professor of Medicine, University of Maryland School of Medicine; Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center; PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center; Senior Statistician University of Maryland Center for Vascular Research; Division of Gerontology and Paliative Care, 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 Cell phone 443-418-5382 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
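[Editor's note] A hedged sketch using the predict() types documented for pscl::zeroinfl objects, applied to the fm_zinb2 model above; "pie" here is pi, the per-subject probability of the structural-zero component. On question 3: as far as the pscl documentation indicates, zeroinfl() maximises the likelihood numerically via optim(), not in closed form.

library(pscl)
pi_hat <- predict(fm_zinb2, type = "zero")    # per-subject P(structural zero), i.e. pi
mu_hat <- predict(fm_zinb2, type = "count")   # per-subject Poisson mean
# For subjects with an observed zero, the posterior probability that the zero is
# structural rather than a chance Poisson zero (Bayes' rule):
p_structural <- pi_hat / (pi_hat + (1 - pi_hat) * dpois(0, mu_hat))
head(p_structural[bioChemists$art == 0])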
Re: [R] Advice on starting to analyze smokestack emissions?
Kevin, I would like to be in touch with you. I am pursuing a research project similar to yours. Perhaps we can help each other. John jsor...@som.umaryland.edu John David Sorkin M.D., Ph.D. Professor of Medicine, University of Maryland School of Medicine; Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center; PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center; Senior Statistician University of Maryland Center for Vascular Research; Division of Gerontology and Paliative Care, 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 Cell phone 443-418-5382 From: R-help on behalf of Kevin Zembower via R-help Sent: Saturday, December 16, 2023 2:06 PM To: R-help email list Subject: Re: [R] Advice on starting to analyze smokestack emissions? Just to follow up on this thread, I didn't experience any problems accessing the air monitoring data with the RAQSAPI package that I anticipated from the US EPA's Air Quality System (AQS) Data Mart database website. I didn't have to qualify with an agency affiliation at all, just an email address. Thanks again, Karl, for suggesting this. -Kevin On Fri, 2023-12-15 at 08:29 -0500, Kevin Zembower wrote: > Bert, Tim, Karl and Richard, thank you all for your suggestions and > help. > > I will try the R-sig-ecology list. > > Karl, I wasn't aware of the RAQSAPI package, but it looked promising. > However, when I went to the source of the data it uses, the United > States Environmental Protection Agency’s (US EPA) Air Quality System > (AQS) Data Mart database, it looks like interactive access to the > data > is restricted to those who can document a professional agency > affiliation. I don't have that. I'll work with the package to see if > this is true regarding obtaining the data through it. Thanks for the > suggestion. > > Richard, the Canada study of crematoriums was very useful. Thanks. > > Thanks, again, all, for your help. > > -Kevin __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Convert two-dimensional array into a three-dimensional array.
Colleagues,

I want to convert a 10x2 matrix:

# create a 10x2 matrix.
datavals <- matrix(nrow=10,ncol=2)
datavals[,] <- rep(c(1,2),10)+c(rnorm(10),rnorm(10))
datavals

into a 10x2x10 array, ThreeDArray, dim(10,2,10). The values stored in ThreeDArray's first dimension will be the data stored in datavals: ThreeDArray[i,,] <- datavals[i,]. The values stored in ThreeDArray's second dimension will be the data stored in datavals: ThreeDArray[,j,] <- datavals[,j]. The data stored in ThreeDArray[,,1] will be 1, the data stored in ThreeDArray[,,2] will be 2, . . ., the data stored in ThreeDArray[,,10] will be 10. I have no idea how to code the conversion of the 10x2 matrix into a 10,2,10 array. I may be able to accomplish my mission by coding each line of the plan described above, but there has to be a more efficient and elegant way to accomplish my goal. Many thanks for your help! John John David Sorkin M.D., Ph.D. Professor of Medicine, University of Maryland School of Medicine; Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center; PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center; Senior Statistician University of Maryland Center for Vascular Research; Division of Gerontology and Palliative Care, 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 Cell phone 443-418-5382 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
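[Editor's note] A hedged sketch: the request reads two ways, so both are shown. array() recycles its first argument, which handles the "copy datavals into every slice" reading in a single call.

ThreeDArray <- array(datavals, dim = c(10, 2, 10))
identical(ThreeDArray[, , 7], datavals)          # TRUE: every slice is a copy of datavals
# If instead slice k should simply be filled with the value k:
ThreeDArray2 <- array(rep(1:10, each = 10 * 2), dim = c(10, 2, 10))
all(ThreeDArray2[, , 3] == 3)                    # TRUE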
[R] Convert character date time to R date-time variable.
Colleagues, I have a matrix of character data that represents date and time. The format of each element of the matrix is "2020-09-17_00:00:00" How can I convert the elements into a valid R date-time constant? Thank you, John John David Sorkin M.D., Ph.D. Professor of Medicine, University of Maryland School of Medicine; Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center; PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center; Senior Statistician University of Maryland Center for Vascular Research; Division of Gerontology and Paliative Care, 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 Cell phone 443-418-5382 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
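[Editor's note] A hedged sketch with base R; the format string matches the example value given ("2020-09-17_00:00:00") and the time zone is an assumption.

x <- c("2020-09-17_00:00:00", "2020-09-18_13:45:30")
as.POSIXct(x, format = "%Y-%m-%d_%H:%M:%S", tz = "UTC")
# Applied to a character matrix, as.POSIXct() returns a plain vector (the dim
# attribute is dropped); it can be re-dimensioned afterwards if needed.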
Re: [R] adding "Page X of XX" to PDFs
https://community.rstudio.com/t/total-number-of-pages-in-quarto-pdf/177316/2 On Sat, 2 Dec 2023 at 09:39, Dennis Fisher wrote: > OS X > R 4.3.1 > > Colleagues > > I often create multipage PDFs [pdf()] in which the text "Page X" appears > in the margin. These PDFs are created automatically using a massive R > script. > > One of my clients requested that I change this to: > Page X of XX > where XX is the total number of pages. > > I don't know the number of expected pages so I can't think of any clever > way to do this. I suppose that I could create the PDF, find out the number > of pages, then have a second pass in which the R script was fed the number > of pages. However, there is one disadvantage to this -- the original PDF > contains a timestamp on each page -- the new version would have a different > timestamp -- so I would prefer to not use this approach. > > Has anyone thought of some terribly clever way to solve this problem? > > Dennis > > Dennis Fisher MD > P < (The "P Less Than" Company) > Phone / Fax: 1-866-PLessThan (1-866-753-7784) > www.PLessThan.com > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
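[Editor's note] A hedged sketch of the two-pass idea discussed in the thread: render once to a throw-away file to learn the page count, then render again writing "Page X of N". pdftools::pdf_info() is used here to count pages; it is an added dependency, not something from the original post, and the report body is a stand-in.

library(pdftools)
render_report <- function(file, total = NA) {
  pdf(file, width = 8.5, height = 11)
  on.exit(dev.off())
  for (p in 1:3) {                              # stand-in for the real report code
    plot(rnorm(100), main = sprintf("Section %d", p))
    lab <- if (is.na(total)) sprintf("Page %d", p) else sprintf("Page %d of %d", p, total)
    mtext(lab, side = 1, line = 4, adj = 1, cex = 0.8)
  }
}
render_report("draft.pdf")                      # pass 1: only to count pages
n <- pdf_info("draft.pdf")$pages
render_report("report.pdf", total = n)          # pass 2: final "Page X of N" numbering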
Re: [R] ggplot adjust two y-axis
https://r-graph-gallery.com/line-chart-dual-Y-axis-ggplot2.html On Fri, 24 Nov 2023 at 12:08, Charles-Édouard Giguère wrote: > Hi, > Just find a scaling factor that would make the two sets of data comparable. > Here I divided the second row by 5 and did the same for the second axis. > Charles-Édouard > F1 <- as.table(matrix(c(50,11,6,17,16,3,1,2237,611,403,240,280,0,0), 2,7)) > barplot(F1, beside = TRUE, col = c("blue", "grey")) axis(2, > at=c(0,10,20,30,40,50,60, labels=c(0,10,20,30,40,50,60))) axis(4, at = > c(0,500,1000,1500,2000,2500), labels = > c(0,500,1000,1500,2000,2500)) > > -Message d'origine- > De : sibylle.stoec...@gmx.ch > Envoyé : 24 novembre 2023 11:27 > À : 'Charles-Édouard Giguère' ; r-help@r-project.org > Objet : RE: [R] ggplot adjust two y-axis > > Dear Charles-Edouard > > Thanks a lot. Yes indeed barplot sounds excellent. > > Unfortunately, the scale of the smaller axis is fixed, even If I am able to > draw to axes. The idea is to expand the scale to the scale to the second > axis for comparison. > F1 <- as.table(matrix(c(50,11,6,17,16,3,1,2237,611,403,240,280,0,0), 2,7)) > barplot(F1, beside = TRUE, col = c("blue", "grey")) axis(2, > at=c(0,10,20,30,40,50,60, labels=c(0,10,20,30,40,50,60))) axis(4, at = > c(0,500,1000,1500,2000,2500), labels = > c(0,500,1000,1500,2000,2500)) > > Kind regards > Sibylle > > > -Original Message- > From: Charles-Édouard Giguère > Sent: Friday, November 24, 2023 3:57 PM > To: sibylle.stoec...@gmx.ch; r-help@r-project.org > Subject: RE: [R] ggplot adjust two y-axis > > Hi, > I don't know the axis mecanism well enough in ggplot but using the > original > barplot function you can add an axis on the right using the axis function. > > Here is an example: > > test <- as.table(matrix(c(2,10,3,11), 2,2)) barplot(test, beside = TRUE, > col > = scales::brewer_pal(palette = 1)(2)) axis(4, at = c(0, 5, 10), labels = > c(0,50,100)) > > > -Message d'origine- > De : sibylle.stoec...@gmx.ch Envoyé : 24 > novembre > 2023 09:27 À : 'Charles-Édouard Giguère' ; > r-help@r-project.org Objet : RE: [R] ggplot adjust two y-axis > > Dear Charles-Edouard > > Thanks a lot. > So no way in R to just simply have one ggplot with to axis as in Excel > (attachment)? > > Kind regards > Sibylle > > -Original Message- > From: Charles-Édouard Giguère > Sent: Friday, November 24, 2023 3:14 PM > To: sibylle.stoec...@gmx.ch; r-help@r-project.org > Subject: RE: [R] ggplot adjust two y-axis > > You could also use more simply facet_wrap(~ Studien_Flaeche). > Charles-Édouard > > -Message d'origine- > De : Charles-Édouard Giguère Envoyé : 24 novembre > 2023 09:11 À : sibylle.stoec...@gmx.ch; r-help@r-project.org Objet : RE: > [R] > ggplot adjust two y-axis > > Hi Sibylle, > For that kind of data with two different scales, I generally use two graphs > that I name gg1 and gg2 and join them using gridExtra::grid.arrange(gg1, > gg2). This way, the red part of your graph is easier to interpret. > Have a nice day, > Charles-Édouard > > -Message d'origine- > De : R-help De la part de > sibylle.stoec...@gmx.ch Envoyé : 24 novembre 2023 05:52 À : > r-help@r-project.org Objet : [R] ggplot adjust two y-axis > > Dear R-users > > Is it possible to adjust two y-axis in a ggplot differently? 
> - First y axis (0-60) > - Second y axis (0-2500) > > > ### Figure 1 > ggplot(Fig1,aes(BFF,Wert,fill=Studien_Flaeche))+ > geom_bar(stat="identity",position='dodge')+ > scale_y_continuous(name="First Axis", sec.axis=sec_axis(trans=~.*50, > name="Second Axis"))+ > scale_fill_brewer(palette="Set1") > > Thanks a lot > Sibylle > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
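[Editor's note] A hedged sketch of the usual ggplot2 pattern behind the link above: rescale the large-valued series onto the 0-60 axis, then undo that scaling inside sec_axis() so the right-hand labels read 0-2500. The column names (BFF, Wert, Studien_Flaeche) come from the quoted code; the level name "gross" in the filter is a placeholder for whichever group belongs on the 0-2500 axis.

library(ggplot2)
library(dplyr)
k <- 50   # scaling factor from the thread; choose it so both series share the 0-60 axis
Fig1_plot <- Fig1 %>%
  mutate(Wert_plot = ifelse(Studien_Flaeche == "gross", Wert / k, Wert))
ggplot(Fig1_plot, aes(BFF, Wert_plot, fill = Studien_Flaeche)) +
  geom_col(position = "dodge") +
  scale_y_continuous(name = "First axis (0-60)",
                     sec.axis = sec_axis(~ . * k, name = "Second axis (0-2500)")) +
  scale_fill_brewer(palette = "Set1")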
Re: [R] Can someone please have a look at this query on stackoverflow?
I ran the code from the answer and it seems to work well. It, definitely, is giving a landscape output.

---
title: "Testing landscape and aspect ratio"
output:
  pdf_document:
    number_sections: true
classoption:
  - landscape
  - "aspectratio=169"
header-includes:
  - \usepackage{dcolumn}
documentclass: article
geometry: margin=1.5cm
---

```{r, out.extra='keepaspectratio=true', out.height='100%', out.width="100%"}
plot(rnorm(100))
```

On Mon, 13 Nov 2023 at 23:33, Ashim Kapoor wrote:

> Dear all,
>
> I have posted a query which has received a response but that is not
> working on my computer.
>
> Here is the query:
>
> https://stackoverflow.com/questions/77387434/pdf-from-rmarkdown-landscape-and-aspectratio-169
>
> Can someone please help me ?
>
> Best Regards,
> Ashim
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
John Kane
Kingston ON Canada

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error running gee function. I neither understand the error message, nor know what needs to be done to get the gee to run
Colleagues, I am receiving several error messages from the gee function. I don't understand the ides the error messages are trying to impart, and I don't know how to debug or correct the error. The error messages follow: > fitgee <- gee(HipFlex ~ > StepHeight,data=datashort,id=PID,corstr="exchangeable",na.action=na.omit) Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27 running glm to get initial regression estimate (Intercept) StepHeight 1.400319 58.570236 Error in gee(HipFlex ~ StepHeight, data = datashort, id = PID, corstr = "exchangeable", : NA/NaN/Inf in foreign function call (arg 3) In addition: Warning message: In gee(HipFlex ~ StepHeight, data = datashort, id = PID, corstr = "exchangeable", : NAs introduced by coercion Of note, when the analysis is run using lm, there is no problem. My fully data and code follow: Thank you, John CODE: if (!require(gee)) {install.packages("gee")} library(gee) datashort <- structure(list(HipFlex = c(1.95, 2.07, 1.55, 0.44, 0.23, 0.41, 0.22, 4.61, 10.02, 1.08, 1.43, 1.82, 0.34, 0.77, 0.22, 1.06, 0.13, 0.36, 2.84, 5.2, 12.27, 1.37, 2.33, 3.48, 4.76, 1.92, 2.09, 4.67, 2.94, 0.75, 0.11, 3.56, 1.63, 0.8, 1.54, 5.06, NA,5.41, 6.18, 3.75, 3.12, 17.43, 3.18, 0.85, 14.54, 14.34, 21.92, 4.91, 1.52, 0.38, 0.43, 0.47, 0.56, 6.4, 12.4, 3.98, 0.57, 1.84, 12.06, 0.45, 8.16, 0.02, 0,0.05, 0.52, 0.11, 0.48, 1.5, 3.29, 2.58, 2.07, 6.06, 1.46, 1.06, 3.82, 1.09, 2.86, 3.47, 2.22, 1.89, NA, 3.48, 6.38, 3.58, 1.83, 2.8, 8.28, 7.15, 4.77, 4.93, 0, 0.11, 1.99, 2.01, 2.3, 1.24, 1.33, 2, 1.01), PID = c("HIPS004", "HIPS004", "HIPS005", "HIPS005", "HIPS005", "HIPS006", "HIPS006", "HIPS008", "HIPS010", "HIPS024", "HIPS024", "HIPS024", "HIPS025", "HIPS028", "HIPS028", "HIPS030", "HIPS030", "HIPS030", "HIPS035", "HIPS035", "HIPS035", "HIPS036", "HIPS036", "HIPS037", "HIPS044", "HIPS047", "HIPS047", "HIPS056", "HIPS056", "HIPS057", "HIPS057", "HIPS057", "HIPS058", "HIPS059", "HIPS059", "HIPS061", "HIPS062", "HIPS062", "HIPS062", "HIPS064", "HIPS074", "HIPS079", "HIPS084", "HIPS089", "HIPS090", "HIPS090", "HIPS090", "HIPS091", "HIPS091", "HIPS092", "HIPS092", "HIPS092", "HIPS001", "HIPS001", "HIPS001", "HIPS004", "HIPS004", "HIPS004", "HIPS005", "HIPS005", "HIPS005", "HIPS006", "HIPS006", "HIPS008", "HIPS022", "HIPS024", "HIPS028", "HIPS030", "HIPS035", "HIPS036", "HIPS036", "HIPS039", "HIPS044", "HIPS047", "HIPS051", "HIPS056", "HIPS058", "HIPS058", "HIPS059", "HIPS059", "HIPS062", "HIPS062", "HIPS062", "HIPS069", "HIPS069", "HIPS071", "HIPS074", "HIPS079", "HIPS084", "HIPS084", "HIPS085", "HIPS089", "HIPS090", "HIPS091", "HIPS091", "HIPS091", "HIPS092", "HIPS092", "HIPS093"), StepHeight = c(0.005, 0.008, 0.072, 0.003, 0.014, 0.01, 0.027, 0.074, 0.128, 0.048, 0.036, 0.024, 0.021, 0.026, 0.03, 0.004, 0.006, 0.006, 0.011, 0.006, 0.053, 0.028, 0.073, 0.041, 0.005, 0.007, 0.013, 0.012, 0.021, 0.053, 0.013, 0.071, 0.012, 0.016, 0.023, 0.024, 0.011, 0.019, 0.014, 0.022, 0.011, 0.129, 0.03, 0.012, 0.062, 0.145, 0.077, 0.028, 0.006, 0.019, 0.008, 0.006, 0.034, 0.109, 0.09, 0.005, 0.016, 0.005, 0.257, 0.011, 0.205, 0.01, 0.017, 0.039, 0.01, 0.016, 0.043, 0.004, 0.008, 0.04, 0.068, 0.006, 0.008, 0.005, 0.097, 0.015, 0.016, 0.01, 0.021, 0.008, 0.01, 0.006, 0.016, 0.021, 0.012, 0.009, 0.032, 0.055, 0.006, 0.066, 0.018, 0.01, 0.018, 0.017, 0.015, 0.01, 0.017, 0.02, 0.022)), class = "data.frame", row.names = c(4L, 5L, 6L,7L, 8L, 10L, 12L, 14L, 19L, 29L, 30L, 31L, 33L, 41L, 43L, 44L, 45L, 46L, 47L, 48L, 51L, 52L, 53L, 58L, 62L, 65L, 67L, 70L, 72L, 74L, 75L, 77L, 79L, 82L, 83L, 86L, 88L, 
89L, 90L, 93L, 109L, 114L, 117L, 129L, 131L, 132L, 133L, 134L, 135L, 136L, 137L, 138L, 142L, 143L, 144L, 145L, 146L, 147L, 148L, 149L, 150L, 151L, 152L, 155L, 165L
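[Editor's note] A hedged guess at the failure, suggested by the "NAs introduced by coercion" warning: gee() wants a numeric id, and PID here is character ("HIPS004", ...), so coercing it produces NAs. Recoding the id first may be all that is needed; the quoted dput() above is truncated, so this is untested on the full data.

library(gee)
datashort$PIDnum <- as.integer(factor(datashort$PID))   # numeric cluster id
fitgee <- gee(HipFlex ~ StepHeight, data = datashort, id = PIDnum,
              corstr = "exchangeable", na.action = na.omit)
summary(fitgee)$coefficients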
[R] by function does not separate output from function with multiple parts
Colleagues, I have written an R function (see fully annotated code below), with which I want to process a dataframe within levels of the variable StepType. My program works, it processes the data within levels of StepType, but the usual headers that separate the output by levels of StepType are at the end of the listing rather than being used as separators, i.e. I get Regression results StepType First Contrast results StepType First Regression results StepType Second Contrast results StepType Second and only after the results are displayed do I get the usual separators: mydata$StepType: First NULL -- mydata$StepType: Second NULL What I want to get is output that includes the separators i.e., mydata$StepType: First Regression results StepType First Contrast results StepType First -- mydata$StepType: Second Regression results StepType Second Contrast results StepType Second Can you help me get the separators included in the printed otput? Thank you, John # Create Dataframe # mydata <- structure(list(HipFlex = c(19.44, 4.44, 3.71, 1.95, 2.07, 1.55, 0.44, 0.23, 2.15, 0.41, 2.3, 0.22, 2.08, 4.61, 4.19, 5.65, 2.73, 1.46, 10.02, 7.41, 6.91, 5.28, 9.56, 2.46, 6, 3.85, 6.43, 3.73, 1.08, 1.43, 1.82, 2.22, 0.34, 5.11, 0.94, 0.98, 2.04, 1.73, 0.94, 18.41, 0.77, 2.31, 0.22, 1.06, 0.13, 0.36, 2.84, 5.2, 2.39, 2.99), jSex = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L), levels = c("Male", "Female"), class = "factor")), row.names = c(NA, 50L), class = "data.frame") mydata[,"StepType"] <- rep(c("First","Second"),25) mydata # END Create Dataframe # # Define function to be run# DoReg <- function(x){ fit0<-lm(as.numeric(HipFlex) ~ jSex,data=x) print(summary(fit0)) cat("\nMale\n") print(contrast(fit0, list(jSex="Male"))) cat("\nFemale\n") print(contrast(fit0, list(jSex="Female"))) cat("\nDifference\n") print(contrast(fit0, a=list(jSex="Male"), b=list(jSex="Female"))) } # END Define function to be run# # # Run function within levels of Steptype# # by(mydata,mydata$StepType,DoReg) # # END Run function within levels of Steptype# # John David Sorkin M.D., Ph.D. Professor of Medicine, University of Maryland School of Medicine; Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center; PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center; Senior Statistician University of Maryland Center for Vascular Research; Division of Gerontology and Paliative Care, 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 Cell phone 443-418-5382 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
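[Editor's note] A hedged sketch of one way to get the group headers attached to the results: have the function return its pieces instead of print()ing them inside, and let by()'s own print method label each group. contrast() is taken from the original code (presumably the contrast package), so its availability is assumed.

DoReg2 <- function(x) {
  fit0 <- lm(as.numeric(HipFlex) ~ jSex, data = x)
  list(regression = summary(fit0),
       Male       = contrast(fit0, list(jSex = "Male")),
       Female     = contrast(fit0, list(jSex = "Female")),
       Difference = contrast(fit0, a = list(jSex = "Male"),
                             b = list(jSex = "Female")))
}
by(mydata, mydata$StepType, DoReg2)   # prints "mydata$StepType: First" above that group's results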
Re: [R] download.file strict certificate revocation check
Ivan, SSL connect error & we definitely have MITM doing certificate interference. No change with True or False with R_LIBCURL_SSL_REVOKE_BEST_EFFORT Environment variable results should be attached. -Original Message- From: Ivan Krylov Sent: Wednesday, October 4, 2023 8:52 AM To: John Neset Cc: r-help@R-project.org Subject: Re: [R] download.file strict certificate revocation check WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe. В Wed, 4 Oct 2023 13:09:47 +0000 John Neset пишет: > Trying to do this, reference FAQ- > 2.18 The Internet download functions fail. > (c) A MITM proxy (typically in enterprise environments) makes it > impossible to validate that certificates haven't been revoked. One can > switch to only best effort revocation checks via an environment > variable: see ?download.file. Here's what help(download.file) has to say: >> On Windows with ‘method = "libcurl"’, when R was linked with >> ‘libcurl’ with ‘Schannel’ enabled, the connection fails if it >> cannot be established that the certificate has not been revoked. >> Some MITM proxies present particularly in corporate environments >> do not work with this behavior. It can be changed by setting >> environment variable ‘R_LIBCURL_SSL_REVOKE_BEST_EFFORT’ to >> ‘TRUE’, with the consequence of reducing security. Does it help to Sys.setenv(...) this environment variable before downloading? If not, please provide your sessionInfo() and the full error message. -- Best regards, Ivan Confidentiality Notice - This communication and any attachments are for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure, distribution or copying is prohibited. If you are not the intended recipient(s), please contact the sender by replying to this e-mail and destroy/delete all copies of this e-mail message. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] download.file strict certificate revocation check
What/how do I interact with the download.file with turning off the strict certificate revocation check in regards to download & update packages? I clearly made an attempt at this, but failed miserably. Trying to do this, reference FAQ- 2.18 The Internet download functions fail. (c) A MITM proxy (typically in enterprise environments) makes it impossible to validate that certificates haven't been revoked. One can switch to only best effort revocation checks via an environment variable: see ?download.file. Confidentiality Notice - This communication and any atta...{{dropped:10}} __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
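[Editor's note] A hedged sketch of the mechanism referenced in FAQ 2.18 and ?download.file: set the environment variable before any download is attempted (or put it in .Renviron so every session picks it up), then use the libcurl method. As the later reply in this thread notes, this did not resolve that particular MITM setup, but this is how the documented switch is applied.

Sys.setenv(R_LIBCURL_SSL_REVOKE_BEST_EFFORT = "TRUE")
options(download.file.method = "libcurl")
install.packages("data.table")   # any package; the name is only an example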
Re: [R] Grouping by Date and showing count of failures by date
To follow up on Rui Barradas's post, I do not think PivotTable is an R command. You may be thinking og the "pivot_longer" and "pivot_wider" functions in the {tidyr} package which is part of {tidyverse}. On Sat, 30 Sept 2023 at 07:03, Rui Barradas wrote: > Às 21:29 de 29/09/2023, Paul Bernal escreveu: > > Dear friends, > > > > Hope you are doing great. I am attaching the dataset I am working with > > because, when I tried to dput() it, I was not able to copy the entire > > result from dput(), so I apologize in advance for that. > > > > I am interested in creating a column named Failure_Date_Period that has > the > > FAILDATE but formatted as _MM. Then I want to count the number of > > failures (given by column WONUM) and just have a dataframe that has the > > FAILDATE and the count of WONUM. > > > > I tried this: > > pt <- PivotTable$new() > > pt$addData(failuredf) > > pt$addColumnDataGroups("FAILDATE") > > pt <- PivotTable$new() > > pt$addData(failuredf) > > pt$addColumnDataGroups("FAILDATE") > > pt$defineCalculation(calculationName = "FailCounts", > > summariseExpression="n()") > > pt$renderPivot() > > > > but I was not successful. Bottom line, I need to create a new dataframe > > that has the number of failures by FAILDATE, but in -MM format. > > > > Any help and/or guidance will be greatly appreciated. > > > > Kind regards, > > Paul > > __ > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > Hello, > > No data is attached. Maybe try > > dput(head(failuredf, 30)) > > ? > > And where can we find non-base PivotTable? Please start the scripts with > calls to library() when using non-base functionality. > > Hope this helps, > > Rui Barradas > > > -- > Este e-mail foi analisado pelo software antivírus AVG para verificar a > presença de vírus. > www.avg.com > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
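[Editor's note] A hedged sketch with dplyr, since the attachment with failuredf never reached the list: FAILDATE is assumed to be (convertible to) a Date, and WONUM to identify one failure per row.

library(dplyr)
failure_counts <- failuredf %>%
  mutate(Failure_Date_Period = format(as.Date(FAILDATE), "%Y-%m")) %>%
  count(Failure_Date_Period, name = "n_failures")
failure_counts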
Re: [R] car::deltaMethod() fails when a particular combination of categorical variables is not present
Dear Michael, My previous response was inaccurate: First, linearHypothesis() *is* able to accommodate aliased coefficients by setting the argument singular.ok = TRUE: > linearHypothesis(minimal_model, "bt2 + csent + bt2:csent = 0", + singular.ok=TRUE) Linear hypothesis test: bt2 + csent + bt2:csent = 0 Model 1: restricted model Model 2: a ~ b * c Res.DfRSS Df Sum of Sq F Pr(>F) 1 16 9392.1 2 15 9266.4 1125.67 0.2034 0.6584 Moreover, when there is an empty cell, this F-test is (for a reason that I haven't worked out, but is almost surely due to how the rank-deficient model is parametrized) *not* equivalent to the t-test for the corresponding coefficient in the raveled version of the two factors: > df$bc <- factor(with(df, paste(b, c, sep=":"))) > m <- lm(a ~ bc, data=df) > summary(m) Call: lm(formula = a ~ bc, data = df) Residuals: Min 1Q Median 3Q Max -57.455 -11.750 0.439 14.011 37.545 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept)20.50 17.57 1.166 0.2617 bct1:unsent37.50 24.85 1.509 0.1521 bct2:other 32.00 24.85 1.287 0.2174 bct2:sent 17.17 22.69 0.757 0.4610 <<< cf. F = 0.2034, p = 0.6584 bct2:unsent38.95 19.11 2.039 0.0595 Residual standard error: 24.85 on 15 degrees of freedom Multiple R-squared: 0.2613,Adjusted R-squared: 0.06437 F-statistic: 1.327 on 4 and 15 DF, p-value: 0.3052 In the full-rank case, however, what I said is correct -- that is, the F-test for the 1 df hypothesis on the three coefficients is equivalent to the t-test for the corresponding coefficient when the two factors are raveled: > linearHypothesis(minimal_model_fixed, "bt2 + csent + bt2:csent = 0") Linear hypothesis test: bt2 + csent + bt2:csent = 0 Model 1: restricted model Model 2: a ~ b * c Res.DfRSS Df Sum of Sq F Pr(>F) 1 15 9714.5 2 14 9194.4 1520.08 0.7919 0.3886 > df_fixed$bc <- factor(with(df_fixed, paste(b, c, sep=":"))) > m <- lm(a ~ bc, data=df_fixed) > summary(m) Call: lm(formula = a ~ bc, data = df_fixed) Residuals: Min 1Q Median 3Q Max -57.455 -11.750 0.167 14.011 37.545 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 64.000 25.627 2.497 0.0256 bct1:sent-43.500 31.387 -1.386 0.1874 bct1:unsent -12.000 36.242 -0.331 0.7455 bct2:other -11.500 31.387 -0.366 0.7195 bct2:sent-26.333 29.591 -0.890 0.3886 << cf. bct2:unsent -4.545 26.767 -0.170 0.8676 Residual standard error: 25.63 on 14 degrees of freedom Multiple R-squared: 0.2671,Adjusted R-squared: 0.005328 F-statistic: 1.02 on 5 and 14 DF, p-value: 0.4425 So, to summarize: (1) You can use linearHypothesis() with singular.ok=TRUE to test the hypothesis that you specified, though I suspect that this hypothesis probably isn't testing what you think in the rank-deficient case. I suspect that the hypothesis that you want to test is obtained by raveling the two factors. (2) There is no reason to use deltaMethod() for a linear hypothesis, but there is also no intrinsic reason that deltaMethod() shouldn't be able to handle a rank-deficient model. We'll probably fix that. My apologies for the confusion, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ On 2023-09-26 9:49 a.m., John Fox wrote: Caution: External email. 
Dear Michael, You're testing a linear hypothesis, so there's no need to use the delta method, but the linearHypothesis() function in the car package also fails in your case: > linearHypothesis(minimal_model, "bt2 + csent + bt2:csent = 0") Error in linearHypothesis.lm(minimal_model, "bt2 + csent + bt2:csent = 0") : there are aliased coefficients in the model. One work-around is to ravel the two factors into a single factor with 5 levels: > df$bc <- factor(with(df, paste(b, c, sep=":"))) > df$bc [1] t2:unsent t2:unsent t2:unsent t2:unsent t2:sent t2:unsent [7] t2:unsent t1:sent t2:unsent t2:unsent t2:other t2:unsent [13] t1:unsent t1:sent t2:unsent t2:other t1:unsent t2:sent [19] t2:sent t2:unsent Levels: t1:sent t1:unsent t2:other t2:sent t2:unsent > m <- lm(a ~ bc, data=df) > summary(m) Call: lm(formula = a ~ bc, data = df) Residuals: Min 1Q Median 3Q Max -57.455 -11.750 0.439 14.011 37.545 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 20.50 17.57 1.166 0.2617 bct1:unsent 37.50 24.85 1.509 0.1521 bct2:other 32.00 24.85 1.287 0.2174 bct2:sent 17.17 22.69 0.757 0.4610 bct2:unsent 38.95 19.11 2.039 0.0595 Residual sta
Re: [R] car::deltaMethod() fails when a particular combination of categorical variables is not present
Dear Michael, You're testing a linear hypothesis, so there's no need to use the delta method, but the linearHypothesis() function in the car package also fails in your case: > linearHypothesis(minimal_model, "bt2 + csent + bt2:csent = 0") Error in linearHypothesis.lm(minimal_model, "bt2 + csent + bt2:csent = 0") : there are aliased coefficients in the model. One work-around is to ravel the two factors into a single factor with 5 levels: > df$bc <- factor(with(df, paste(b, c, sep=":"))) > df$bc [1] t2:unsent t2:unsent t2:unsent t2:unsent t2:sent t2:unsent [7] t2:unsent t1:sent t2:unsent t2:unsent t2:other t2:unsent [13] t1:unsent t1:sent t2:unsent t2:other t1:unsent t2:sent [19] t2:sent t2:unsent Levels: t1:sent t1:unsent t2:other t2:sent t2:unsent > m <- lm(a ~ bc, data=df) > summary(m) Call: lm(formula = a ~ bc, data = df) Residuals: Min 1Q Median 3Q Max -57.455 -11.750 0.439 14.011 37.545 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept)20.50 17.57 1.166 0.2617 bct1:unsent37.50 24.85 1.509 0.1521 bct2:other 32.00 24.85 1.287 0.2174 bct2:sent 17.17 22.69 0.757 0.4610 bct2:unsent38.95 19.11 2.039 0.0595 Residual standard error: 24.85 on 15 degrees of freedom Multiple R-squared: 0.2613,Adjusted R-squared: 0.06437 F-statistic: 1.327 on 4 and 15 DF, p-value: 0.3052 Then the hypothesis is tested directly by the t-value for the coefficient bct2:sent. I hope that this helps, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ On 2023-09-26 1:12 a.m., Michael Cohn wrote: Caution: External email. I'm running a linear regression with two categorical predictors and their interaction. One combination of levels does not occur in the data, and as expected, no parameter is estimated for it. I now want to significance test a particular combination of levels that does occur in the data (ie, I want to get a confidence interval for the total prediction at given levels of each variable). In the past I've done this using car::deltaMethod() but in this dataset that does not work, as shown in the example below: The regression model gives the expected output, but deltaMethod() gives this error: error in t(gd) %*% vcov. : non-conformable arguments I believe this is because there is no parameter estimate for when the predictors have the values 't1' and 'other'. In the df_fixed dataframe, putting one person into that combination of categories causes deltaMethod() to work as expected. I don't know of any theoretical reason that missing one interaction parameter estimate should prevent getting a confidence interval for a different combination of predictors. Is there a way to use deltaMethod() or some other function to do this without changing my data? 
Thank you, - Michael Cohn Vote Rev (http://voterev.org) Demonstration: -- library(car) # create dataset with outcome and two categorical predictors outcomes <- c(91,2,60,53,38,78,48,33,97,41,64,84,64,8,66,41,52,18,57,34) persontype <- c("t2","t2","t2","t2","t2","t2","t2","t1","t2","t2","t2","t2","t1","t1","t2","t2","t1","t2","t2","t2") arm_letter <- c("unsent","unsent","unsent","unsent","sent","unsent","unsent","sent","unsent","unsent","other","unsent","unsent","sent","unsent","other","unsent","sent","sent","unsent") df <- data.frame(a = outcomes, b=persontype, c=arm_letter) # note: there are no records with the combination 't1' + 'other' table(df$b,df$c) #regression works as expected minimal_formula <- formula("a ~ b*c") minimal_model <- lm(minimal_formula, data=df) summary(minimal_model) #use deltaMethod() to get a prediction for individuals with the combination 'b2' and 'sent' # deltaMethod() fails with "error in t(gd) %*% vcov. : non-conformable arguments." deltaMethod(minimal_model, "bt2 + csent + `bt2:csent`", rhs=0) # duplicate the dataset and change one record to be in the previously empty cell df_fixed <- df df_fixed[c(13),"c"] <- 'other' table(df_fixed$b,df_fixed$c) #deltaMethod() now works minimal_model_fixed <- lm(minimal_formula, data=df_fixed) deltaMethod(minimal_model_fixed, "bt2 + csent + `bt2:csent`", rhs=0) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Print hypothesis warning- Car package
Hi Peter, On 2023-09-18 10:08 a.m., peter dalgaard wrote: Caution: External email. Also, I would guess that the code precedes the use of backticks in non-syntactic names. Indeed, by more than a decade (though modified in the interim). Could they be deployed here? I don't think so, at least not without changing how the function works. The problem doesn't occur when the hypothesis is specified symbolically as a character vector, including in equation form, only when the hypothesis matrix is given directly, in which case linearHypothesis() tries to construct the equation-form representation, again as character vectors. Its inability to do so when the coefficient names include arithmetic operators doesn't, I think, require a warning or even a message: the symbolic representation of the hypothesis can simply be omitted. The numeric results reported are entirely unaffected. I've made this change and will commit it to the next version of the car package. Thank you for the suggestion, John - Peter On 17 Sep 2023, at 16:43 , John Fox wrote: Dear Robert, Anova() calls linearHypothesis(), also in the car package, to compute sums of squares and df, supplying appropriate hypothesis matrices. linearHypothesis() usually tries to express the hypothesis matrix in symbolic equation form for printing, but won't do this if coefficient names include arithmetic operators, in your case - and +, which can confuse it. The symbolic form of the hypothesis isn't really relevant for Anova(), which doesn't use the printed representation of each hypothesis, and so, despite the warnings, you get the correct ANOVA table. In your case, where the data are balanced, with 4 cases per cell, Anova(mod) and summary(mod) are equivalent, which makes me wonder why you would use Anova() in the first place. To elaborate a bit, linearHypothesis() does tolerate arithmetic operators in coefficient names if you specify the hypothesis symbolically rather than as a hypothesis matrix. For example, to test, the interaction: --- snip linearHypothesis(mod, + c("TreatmentDabrafenib:ExpressionCD271+ = 0", +"TreatmentTrametinib:ExpressionCD271+ = 0", +"TreatmentCombination:ExpressionCD271+ = 0")) Linear hypothesis test Hypothesis: TreatmentDabrafenib:ExpressionCD271+ = 0 TreatmentTrametinib:ExpressionCD271+ = 0 TreatmentCombination:ExpressionCD271+ = 0 Model 1: restricted model Model 2: Viability ~ Treatment * Expression Res.Df RSS Df Sum of Sq F Pr(>F) 1 27 18966 2 24 16739 32226.3 1.064 0.3828 --- snip Alternatively: --- snip H <- matrix(0, 3, 8) H[1, 6] <- H[2, 7] <- H[3, 8] <- 1 H [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [1,]00000100 [2,]00000010 [3,]00000001 linearHypothesis(mod, H) Linear hypothesis test Hypothesis: Model 1: restricted model Model 2: Viability ~ Treatment * Expression Res.Df RSS Df Sum of Sq F Pr(>F) 1 27 18966 2 24 16739 32226.3 1.064 0.3828 Warning message: In printHypothesis(L, rhs, names(b)) : one or more coefficients in the hypothesis include arithmetic operators in their names; the printed representation of the hypothesis will be omitted --- snip There's no good reason that linearHypothesis() should try to express each hypothesis symbolically for Anova(), since Anova() doesn't use that information. When I have some time, I'll arrange to avoid the warning. Best, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ On 2023-09-16 4:39 p.m., Robert Baer wrote: Caution: External email. 
When doing Anova using the car package, I get a print warning that is unexpected. It seemingly involves having my flow cytometry factor levels named CD271+ and CD271-. But I am not sure this warning is intended behavior. Any explanation about whether I'm doing something wrong? Why can't I have CD271+ and CD271- as factor levels? It's legal text, isn't it? library(car) mod = aov(Viability ~ Treatment*Expression, data = dat1) Anova(mod, type =2) Anova Table (Type II tests) Response: Viability Sum Sq Df F value Pr(>F) Treatment 19447.3 3 9.2942 0.0002927 *** Expression 2669.8 1 3.8279 0.0621394 . Treatment:Expression 2226.3 3 1.0640 0.3828336 Residuals 16739.3 24 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Warning messages: 1: In printHypothesis(L, rhs, names(b)) : one or more coefficients in the hypothesis include arithmetic operators in their names; the printed representation of the hypothesis will be omitted 2: In printHypothesis(L, rhs, names(b)) : one or more coefficients in the hypothesis include arithmetic operators in their names; the printed representation of the hypothesis will be omitted
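One way to avoid the warning altogether, a workaround not proposed in the thread itself, is to rename the factor levels so that they contain no arithmetic operators before fitting; make.names() does this mechanically, and only the labels change, not the fit. A minimal sketch using the object names from Robert's example:

```
library(car)

# turn "CD271-"/"CD271+" into syntactically valid, unique labels
# (e.g. "CD271." and "CD271..1"); the data and the fit are unchanged
levels(dat1$Expression) <- make.names(levels(dat1$Expression), unique = TRUE)

mod <- aov(Viability ~ Treatment * Expression, data = dat1)
Anova(mod, type = 2)   # same Type II table, without the printHypothesis() warning
```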
Re: [R] Print hypothesis warning- Car package
Dear Robert, Anova() calls linearHypothesis(), also in the car package, to compute sums of squares and df, supplying appropriate hypothesis matrices. linearHypothesis() usually tries to express the hypothesis matrix in symbolic equation form for printing, but won't do this if coefficient names include arithmetic operators, in your case - and +, which can confuse it. The symbolic form of the hypothesis isn't really relevant for Anova(), which doesn't use the printed representation of each hypothesis, and so, despite the warnings, you get the correct ANOVA table. In your case, where the data are balanced, with 4 cases per cell, Anova(mod) and summary(mod) are equivalent, which makes me wonder why you would use Anova() in the first place. To elaborate a bit, linearHypothesis() does tolerate arithmetic operators in coefficient names if you specify the hypothesis symbolically rather than as a hypothesis matrix. For example, to test, the interaction: --- snip > linearHypothesis(mod, + c("TreatmentDabrafenib:ExpressionCD271+ = 0", +"TreatmentTrametinib:ExpressionCD271+ = 0", +"TreatmentCombination:ExpressionCD271+ = 0")) Linear hypothesis test Hypothesis: TreatmentDabrafenib:ExpressionCD271+ = 0 TreatmentTrametinib:ExpressionCD271+ = 0 TreatmentCombination:ExpressionCD271+ = 0 Model 1: restricted model Model 2: Viability ~ Treatment * Expression Res.Df RSS Df Sum of Sq F Pr(>F) 1 27 18966 2 24 16739 32226.3 1.064 0.3828 --- snip Alternatively: --- snip > H <- matrix(0, 3, 8) > H[1, 6] <- H[2, 7] <- H[3, 8] <- 1 > H [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [1,]00000100 [2,]00000010 [3,]00000001 > linearHypothesis(mod, H) Linear hypothesis test Hypothesis: Model 1: restricted model Model 2: Viability ~ Treatment * Expression Res.Df RSS Df Sum of Sq F Pr(>F) 1 27 18966 2 24 16739 32226.3 1.064 0.3828 Warning message: In printHypothesis(L, rhs, names(b)) : one or more coefficients in the hypothesis include arithmetic operators in their names; the printed representation of the hypothesis will be omitted --- snip There's no good reason that linearHypothesis() should try to express each hypothesis symbolically for Anova(), since Anova() doesn't use that information. When I have some time, I'll arrange to avoid the warning. Best, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ On 2023-09-16 4:39 p.m., Robert Baer wrote: Caution: External email. When doing Anova using the car package, I get a print warning that is unexpected. It seemingly involves have my flow cytometry factor levels named CD271+ and CD171-. But I am not sure this warning should be intended behavior. Any explanation about whether I'm doing something wrong? Why can't I have CD271+ and CD271- as factor levels? Its legal text isn't it? library(car) mod = aov(Viability ~ Treatment*Expression, data = dat1) Anova(mod, type =2) Anova Table (Type II tests) Response: Viability Sum Sq Df F value Pr(>F) Treatment 19447.3 3 9.2942 0.0002927 *** Expression 2669.8 1 3.8279 0.0621394 . Treatment:Expression 2226.3 3 1.0640 0.3828336 Residuals 16739.3 24 --- Signif. 
codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Warning messages: 1: In printHypothesis(L, rhs, names(b)) : one or more coefficients in the hypothesis include arithmetic operators in their names; the printed representation of the hypothesis will be omitted 2: In printHypothesis(L, rhs, names(b)) : one or more coefficients in the hypothesis include arithmetic operators in their names; the printed representation of the hypothesis will be omitted 3: In printHypothesis(L, rhs, names(b)) : one or more coefficients in the hypothesis include arithmetic operators in their names; the printed representation of the hypothesis will be omitted The code to reproduce: ``` dat1 <-structure(list(Treatment = structure(c(1L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), levels = c("Control", "Dabrafenib", "Trametinib", "Combination"), class = "factor"), Expression = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L), levels = c("CD271-", "CD271+"), class = "factor"),
[R] Theta from negative binomial regression and power_NegativeBinomial from PASSED
Colleagues, I want to use the power_NegativeBinomial function from the PASSED library. The function requires a value for a parameter theta. The meaning of theta is not given in the documentation of the function (at least I can't find it). Further, the descriptions of the negative binomial distribution that I am familiar with do not mention theta as being a parameter of the distribution. I noticed that when one runs the glm.nb function to perform a negative binomial regression one obtains a value for theta. This leads to two questions: 1. Is the theta required by the power_NegativeBinomial function the theta that is produced by the glm.nb function? 2. What is theta, and how does it relate to the parameters of the negative binomial distribution? Thank you, John [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
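In MASS::glm.nb(), theta is the negative binomial dispersion parameter, with Var(Y) = mu + mu^2/theta; it is the same quantity that dnbinom()/rnbinom() call size. Whether power_NegativeBinomial() expects exactly this parameterization is best confirmed from the PASSED documentation or source, but the following short simulation (made-up values, not from the thread) illustrates what glm.nb()'s theta measures:

```
library(MASS)

set.seed(1)
mu    <- 4     # mean (made-up value)
theta <- 2.5   # dispersion: Var(Y) = mu + mu^2 / theta

# rnbinom()'s 'size' argument is this same theta
y <- rnbinom(10000, mu = mu, size = theta)
c(mean = mean(y), var = var(y), theoretical_var = mu + mu^2 / theta)

# an intercept-only negative binomial regression recovers theta
fit <- glm.nb(y ~ 1)
fit$theta
```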
Re: [R] R coding errors
Can you supply us with some sample data? A handy way to supply some sample data is the dput() function. In the case of a large dataset something like dput(head(mydata, 100)) should supply the data we need. Just do dput(mydata) where *mydata* is your data. Copy the output and paste it here. On Fri, 8 Sept 2023 at 10:42, PROFESOR MADYA DR NORHAYATI BAHARUN < norha...@uitm.edu.my> wrote: > Hi Sir, > > Could you please help me on the following errors: > > STEPS TO MIX IRT_RMT APPROACH IN R > > #1- Load required libraries > library(eRm) > library(ltm) > library(mirt) > library(psych) > > HT <- read.csv("C:/Users/User/Dropbox/Analysis R_2023/HT2.csv") > str(HT) > > #2- Load or create your data matrix > response_columns <- HT[, 1:ncol(HT)] > response_matrix <- as.matrix(response_columns) > > #3- Fit IRT model > irt_model <- gpcm(response_matrix) > irt_model > summary(irt_model) > > #4- Fit Rasch model > rasch_model <- PCM(response_matrix) > rasch_model > summary(rasch_model) > > #5- Compare item parameter estimates between IRT and Rasch models > irt_item_parameters <- coef(irt_model) > rasch_item_parameters <- coef(rasch_model) > > #6- Compare person ability estimates between IRT and Rasch models > #TRY1 > irt_person_abilities <- fscores(irt_model)###ERROR### > #TRY2 > #a(IRT)- Fit your GPCM model using ltm > irt_model <- gpcm(data = HT, constraint = "1PL") > > #a(IRT)- Calculate factor scores (IRT person abilities) > irt_person_abilities <- factor.scores(irt_model) ###ERROR### > irt_person_abilities_dim1 <- factor.scores(gpcm_model, f = 1) > > #TRY3 > # Fit your GPCM model using mirt > irt_model <- mirt(data = HT, model = "gpcm", itemtype = "graded") > ###ERROR### > > # Calculate factor scores for dimension 1 (adjust the dimension as needed) > irt_person_abilities_dim1 <- fscores(gpcm_model, method = "EAP", dims = 1) > > > #b(RMT) > rasch_person_abilities <- person.parameter(rasch_model)$theta > > > #7- Perform model comparison using fit statistics (e.g., AIC, BIC) > irt_aic <- AIC(irt_model) > rasch_aic <- AIC(rasch_model) > > irt_bic <- BIC(irt_model) > rasch_bic <- BIC(rasch_model) > > #8- Print or visualize the results for comparison > print("Item Parameter Estimates:") > print(irt_item_parameters) > print(rasch_item_parameters) > > print("Person Ability Estimates:") > print(irt_person_abilities) ###ERROR### > print(rasch_person_abilities) > > print("Model Fit Statistics:") > print(paste("IRT AIC:", irt_aic)) > print(paste("Rasch AIC:", rasch_aic)) > > print(paste("IRT BIC:", irt_bic)) > print(paste("Rasch BIC:", rasch_bic)) > > Hope to get your response. > > Many thanks. > > Regards, > > Norhayati > > -- > > > > > *PENAFIAN: *E-mel ini dan apa-apa fail yang dihantar > bersama-samanya > ("Mesej") adalah dihasratkan hanya untuk kegunaan > penerima yang dinyatakan > di atas dan mungkin mengandungi maklumat yang tidak > umum, bermilik, > istimewa, sulit dan dikecualikan dari penzahiran di bawah > undang-undang > yang terpakai termasuklah Akta Rahsia Rasmi 1972. *BACA SELANJUTNYA...* > <https://mail.uitm.edu.my/index.php?option=com_content&view=article&id=83>*DISCLAIMER > > :** This e-mail and any files transmitted with it > ("Message") is intended > only for the use of the recipient(s) named > above and may contain > information that is non-public, proprietary, > privileged, confidential > and > exempt from disclosure under applicable law including the > Official > Secrets Act 1972. 
-- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Issue with gc() on Ubuntu 20.04
On 27-08-2023 21:02, Ivan Krylov wrote: On Sun, 27 Aug 2023 19:54:23 +0100 John Logsdon wrote: Not so although it did lower the gc() time to 95.84%. This was on a 16 core Threadripper 1950X box so I was intending to use library parallel but I tried it on my lowly windows box that is years old and got it down to 88.07%. Does the Windows box have the same version of R on it? Yes, they are both 4.3.1 The only thing I can think of is that there are quite a lot of cases where a function is generated on the fly as in: eval(parse(t=paste("dprob <- function(x,l,s){",dist.functions[2,][dist.functions[1,]==distn],"(x,l,s)}",sep=""))) This isn't very idiomatic. If you need dprob to call the function named in dist.functions[2,][dist.functions[1,]==distn], wouldn't it be easier for R to assign that function straight to dprob? dprob <- get(dist.functions[2,][dist.functions[1,]==distn]) This way, you avoid the need to parse the code, which is typically not the fastest part of a programming language. (Generally in R and other programming languages with recursive data structures, storing variable names in other variables is not very efficient. Why not put functions directly into a list?) Agreed but this statement and other similar ones are only assigned once in an outer loop. Rprof() samples the whole call stack. Can you find out which functions result in a call to gc()? I haven't experimented with a wide sample of R code, but I don't usually encounter gc() as a major entry in my Rprof() outputs. From the first table, removing all the system functions, it suggests that the function do.combx() is mainly guilty. I have recoded that and gc() no longer appears - as it shouldn't with it switched off! One difference was that the new code used the built in combn function while the old code used gtools::combinations. I need gtools::permutations elsewhere but that is not time critical. Thanks Ivan for making me think! -- John Logsdon Quantex Research Ltd m:+447717758675/h:+441614454951 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
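As an illustration of the last point in Ivan's reply (storing the functions themselves in a list instead of storing their names and rebuilding code with eval(parse())), here is a minimal sketch; the distribution names and the l/s argument convention are hypothetical stand-ins for whatever dist.functions actually holds:

```
# a named list of density functions, indexed directly by the distribution label
dist.functions <- list(
  weibull = function(x, l, s) dweibull(x, shape = l, scale = s),
  lnorm   = function(x, l, s) dlnorm(x, meanlog = l, sdlog = s),
  gamma   = function(x, l, s) dgamma(x, shape = l, scale = s)
)

distn <- "weibull"                  # chosen elsewhere in the program
dprob <- dist.functions[[distn]]    # no parse(), no get()

dprob(1:3, 1.5, 2)                  # used like any other function
```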
[R] Issue with gc() on Ubuntu 20.04
Folks, I have come across an issue with gc() hogging the processor according to Rprof. Platform is Ubuntu 20.04, all up to date, R version 4.3.1, libraries: survival, MASS, gtools and openxlsx. With default gc.auto options, the profiler notes the garbage collector as self.pct 99.39%. So I have tried switching it off using options(gc.auto=Inf) in the R session before running my program using source(). This lowered self.pct to 99.36%. Not much there. After some pondering, I added an options(gc.auto=Inf) at the beginning of each function, not resetting it at exit, but expecting the offending function(s) to plead guilty. Not so, although it did lower the gc() time to 95.84%. This was on a 16 core Threadripper 1950X box so I was intending to use library parallel, but I tried it on my lowly Windows box that is years old and got it down to 88.07%. The only thing I can think of is that there are quite a lot of cases where a function is generated on the fly as in: eval(parse(t=paste("dprob <- function(x,l,s){",dist.functions[2,][dist.functions[1,]==distn],"(x,l,s)}",sep=""))) I haven't added the options to any of these. The highest time used by any of my functions is 0.05% - the rest is dominated by gc(). There may not be much point in parallelising the code until I can reduce the garbage collection. I am not short of memory and would like to disable it fully but despite adding to all routines, I haven't managed to do this yet. Can anyone advise me? And why is the Linux version so much worse than Windows? TIA -- John Logsdon Quantex Research Ltd m:+447717758675/h:+441614454951 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Determining Starting Values for Model Parameters in Nonlinear Regression
Dear John, John, and Paul, In this case, one can start values by just fitting > lm(1/y ~ x1 + x2 + x3 - 1, data=mydata) Call: lm(formula = 1/y ~ x1 + x2 + x3 - 1, data = mydata) Coefficients: x1 x2 x3 0.00629 0.00868 0.00803 Of course, the errors enter this model differently, so this isn't the same as the nonlinear model, but the regression coefficients are very close to the estimates for the nonlinear model. Best, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ On 2023-08-19 6:39 p.m., Sorkin, John wrote: Caution: External email. Colleagues, At the risk of starting a forest fire, or perhaps a brush fire, while it is good to see that nlxb can find a solution from arbitrary starting values, I think Paul’s question has merit despite Professor Nash’s excellent and helpful observation. Although non-linear algorithms can converge, they can converge to a false solution if starting values are sub-optimally specified. When possible, I try to specify thought-out starting values. Would it make sense to plot y as a function of (x1, x2) at different values of x3 to get a sense of possible starting values? Or, perhaps using median values of x1, x2, and x3 as starting values. Comparing results from different starting values can give some confidence that the solution obtained using arbitrary starting values are likely “correct”. I freely admit that my experience (and thus expertise) using non-linear solutions is limited. Please do not flame me, I am simply urging caution. John John David Sorkin M.D., Ph.D. Professor of Medicine Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) On Aug 19, 2023, at 4:35 PM, J C Nash mailto:profjcn...@gmail.com>> wrote: Why bother. nlsr can find a solution from very crude start. Mixture <- c(17, 14, 5, 1, 11, 2, 16, 7, 19, 23, 20, 6, 13, 21, 3, 18, 15, 26, 8, 22) x1 <- c(69.98, 72.5, 77.6, 79.98, 74.98, 80.06, 69.98, 77.34, 69.99, 67.49, 67.51, 77.63, 72.5, 67.5, 80.1, 69.99, 72.49, 64.99, 75.02, 67.48) x2 <- c(29, 25.48, 21.38, 19.85, 22, 18.91, 29.99, 19.65, 26.99, 29.49, 32.47, 20.35, 26.48, 31.47, 16.87, 27.99, 24.49, 31.99, 24.96, 30.5) x3 <- c(1, 2, 1, 0, 3, 1, 0, 2.99, 3, 3, 0, 2, 1, 1, 3, 2, 3, 3, 0, 2) y <- c(1.4287, 1.4426, 1.4677, 1.4774, 1.4565, 1.4807, 1.4279, 1.4684, 1.4301, 1.4188, 1.4157, 1.4686, 1.4414, 1.4172, 1.4829, 1.4291, 1.4438, 1.4068, 1.4524, 1.4183) mydata<-data.frame(Mixture, x1, x2, x3, y) mydata mymod <- y ~ 1/(Beta1*x1 + Beta2*x2 + Beta3*x3) library(nlsr) strt<-c(Beta1=1, Beta2=2, Beta3=3) trysol<-nlxb(formula=mymod, data=mydata, start=strt, trace=TRUE) trysol # or pshort(trysol) Output is residual sumsquares = 1.5412e-05 on 20 observations after 29Jacobian and 43 function evaluations namecoeff SE tstat pval gradient JSingval Beta1 0.00629212 5.997e-06 1049 2.425e-42 4.049e-08 721.8 Beta2 0.00867741 1.608e-05 539.7 1.963e-37 -2.715e-08 56.05 Beta3 0.00801948 8.809e-05 91.03 2.664e-24 1.497e-08 10.81 J Nash On 2023-08-19 16:19, Paul Bernal wrote: Dear friends, Hope you are all doing well and having a great weekend. I have data that was collected on specific gravity and spectrophotometer analysis for 26 mixtures of NG (nitroglycerine), TA (triacetin), and 2 NDPA (2 - nitrodiphenylamine). 
In the dataset, x1 = %NG, x2 = %TA, and x3 = %2 NDPA. The response variable is the specific gravity, and the rest of the variables are the predictors. This is the dataset: dput(mod14data_random) structure(list(Mixture = c(17, 14, 5, 1, 11, 2, 16, 7, 19, 23, 20, 6, 13, 21, 3, 18, 15, 26, 8, 22), x1 = c(69.98, 72.5, 77.6, 79.98, 74.98, 80.06, 69.98, 77.34, 69.99, 67.49, 67.51, 77.63, 72.5, 67.5, 80.1, 69.99, 72.49, 64.99, 75.02, 67.48), x2 = c(29, 25.48, 21.38, 19.85, 22, 18.91, 29.99, 19.65, 26.99, 29.49, 32.47, 20.35, 26.48, 31.47, 16.87, 27.99, 24.49, 31.99, 24.96, 30.5), x3 = c(1, 2, 1, 0, 3, 1, 0, 2.99, 3, 3, 0, 2, 1, 1, 3, 2, 3, 3, 0, 2), y = c(1.4287, 1.4426, 1.4677, 1.4774, 1.4565, 1.4807, 1.4279, 1.4684, 1.4301, 1.4188, 1.4157, 1.4686, 1.4414, 1.4172, 1.4829, 1.4291, 1.4438, 1.4068, 1.4524, 1.4183)), row.names = c(NA, -20L), class = "data.frame") The model is the following: y = 1/(Beta1x1 + Beta2x2 + Beta3x3) I need to determine starting (initial) values for the model parameters for this nonlinear regression model, any ideas on how to accomplish this using R? Cheers, Paul [[alternative HTML version deleted]]
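Tying the two replies together: the coefficients of the auxiliary lm(1/y ~ x1 + x2 + x3 - 1) fit shown at the top of John Fox's reply can be passed straight to a nonlinear fitter as starting values. A minimal sketch with nls() (nlxb() from nlsr, as in J C Nash's post, would accept the same start list), assuming the mydata data frame built from the posted vectors:

```
# linearized fit: 1/y ~ Beta1*x1 + Beta2*x2 + Beta3*x3 (no intercept)
start_fit <- lm(1/y ~ x1 + x2 + x3 - 1, data = mydata)

# rename the coefficients to match the nonlinear model's parameters
strt <- setNames(as.list(coef(start_fit)), c("Beta1", "Beta2", "Beta3"))

# nonlinear least squares, started from the linearized estimates
nls_fit <- nls(y ~ 1/(Beta1*x1 + Beta2*x2 + Beta3*x3), data = mydata, start = strt)
summary(nls_fit)
```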
Re: [R] Determining Starting Values for Model Parameters in Nonlinear Regression
Colleagues, At the risk of starting a forest fire, or perhaps a brush fire, while it is good to see that nlxb can find a solution from arbitrary starting values, I think Paul’s question has merit despite Professor Nash’s excellent and helpful observation. Although non-linear algorithms can converge, they can converge to a false solution if starting values are sub-optimally specified. When possible, I try to specify thought-out starting values. Would it make sense to plot y as a function of (x1, x2) at different values of x3 to get a sense of possible starting values? Or, perhaps using median values of x1, x2, and x3 as starting values. Comparing results from different starting values can give some confidence that the solution obtained using arbitrary starting values are likely “correct”. I freely admit that my experience (and thus expertise) using non-linear solutions is limited. Please do not flame me, I am simply urging caution. John John David Sorkin M.D., Ph.D. Professor of Medicine Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) On Aug 19, 2023, at 4:35 PM, J C Nash mailto:profjcn...@gmail.com>> wrote: Why bother. nlsr can find a solution from very crude start. Mixture <- c(17, 14, 5, 1, 11, 2, 16, 7, 19, 23, 20, 6, 13, 21, 3, 18, 15, 26, 8, 22) x1 <- c(69.98, 72.5, 77.6, 79.98, 74.98, 80.06, 69.98, 77.34, 69.99, 67.49, 67.51, 77.63, 72.5, 67.5, 80.1, 69.99, 72.49, 64.99, 75.02, 67.48) x2 <- c(29, 25.48, 21.38, 19.85, 22, 18.91, 29.99, 19.65, 26.99, 29.49, 32.47, 20.35, 26.48, 31.47, 16.87, 27.99, 24.49, 31.99, 24.96, 30.5) x3 <- c(1, 2, 1, 0, 3, 1, 0, 2.99, 3, 3, 0, 2, 1, 1, 3, 2, 3, 3, 0, 2) y <- c(1.4287, 1.4426, 1.4677, 1.4774, 1.4565, 1.4807, 1.4279, 1.4684, 1.4301, 1.4188, 1.4157, 1.4686, 1.4414, 1.4172, 1.4829, 1.4291, 1.4438, 1.4068, 1.4524, 1.4183) mydata<-data.frame(Mixture, x1, x2, x3, y) mydata mymod <- y ~ 1/(Beta1*x1 + Beta2*x2 + Beta3*x3) library(nlsr) strt<-c(Beta1=1, Beta2=2, Beta3=3) trysol<-nlxb(formula=mymod, data=mydata, start=strt, trace=TRUE) trysol # or pshort(trysol) Output is residual sumsquares = 1.5412e-05 on 20 observations after 29Jacobian and 43 function evaluations namecoeff SE tstat pval gradient JSingval Beta1 0.00629212 5.997e-06 1049 2.425e-42 4.049e-08 721.8 Beta2 0.00867741 1.608e-05 539.7 1.963e-37 -2.715e-08 56.05 Beta3 0.00801948 8.809e-05 91.03 2.664e-24 1.497e-08 10.81 J Nash On 2023-08-19 16:19, Paul Bernal wrote: Dear friends, Hope you are all doing well and having a great weekend. I have data that was collected on specific gravity and spectrophotometer analysis for 26 mixtures of NG (nitroglycerine), TA (triacetin), and 2 NDPA (2 - nitrodiphenylamine). In the dataset, x1 = %NG, x2 = %TA, and x3 = %2 NDPA. The response variable is the specific gravity, and the rest of the variables are the predictors. 
This is the dataset: dput(mod14data_random) structure(list(Mixture = c(17, 14, 5, 1, 11, 2, 16, 7, 19, 23, 20, 6, 13, 21, 3, 18, 15, 26, 8, 22), x1 = c(69.98, 72.5, 77.6, 79.98, 74.98, 80.06, 69.98, 77.34, 69.99, 67.49, 67.51, 77.63, 72.5, 67.5, 80.1, 69.99, 72.49, 64.99, 75.02, 67.48), x2 = c(29, 25.48, 21.38, 19.85, 22, 18.91, 29.99, 19.65, 26.99, 29.49, 32.47, 20.35, 26.48, 31.47, 16.87, 27.99, 24.49, 31.99, 24.96, 30.5), x3 = c(1, 2, 1, 0, 3, 1, 0, 2.99, 3, 3, 0, 2, 1, 1, 3, 2, 3, 3, 0, 2), y = c(1.4287, 1.4426, 1.4677, 1.4774, 1.4565, 1.4807, 1.4279, 1.4684, 1.4301, 1.4188, 1.4157, 1.4686, 1.4414, 1.4172, 1.4829, 1.4291, 1.4438, 1.4068, 1.4524, 1.4183)), row.names = c(NA, -20L), class = "data.frame") The model is the following: y = 1/(Beta1x1 + Beta2x2 + Beta3x3) I need to determine starting (initial) values for the model parameters for this nonlinear regression model, any ideas on how to accomplish this using R? Cheers, Paul [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Could not read time series data using read.zoo()
One reason seems to be you are saying sep = "," and there is no "," in the file. Also you only have 3 columns of data but 4 variable names. On Thu, 3 Aug 2023 at 10:53, Christofer Bogaso wrote: > Hi, > > I have a CSV which contains data like below (only first few rows), > > Date Adj Close lret > 02-01-1997 737.01 > 03-01-1997 748.03 1.48416235 > 06-01-1997 747.65 -0.050813009 > 07-01-1997 753.23 0.743567202 > 08-01-1997 748.41 -0.64196699 > 09-01-1997 754.85 0.856809786 > 10-01-1997 759.5 0.614126802 > > However when I try to read this data using below code I get error, > > read.zoo("1.csv", sep = ',', format = '%d-%m-%Y') > > Error reads as, > > index has 4500 bad entries at data rows: 1 2 3 4 5 6 7 8 9. > > Could you please help to understand why I am getting this error? > > > sessionInfo() > > R version 4.2.2 (2022-10-31) > > Platform: x86_64-apple-darwin17.0 (64-bit) > > Running under: macOS Big Sur ... 10.16 > > > Matrix products: default > > BLAS: > /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib > > LAPACK: > /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib > > > locale: > > [1] C/UTF-8/C/C/C/C > > > attached base packages: > > [1] stats graphics grDevices utils datasets methods base > > > other attached packages: > > [1] zoo_1.8-12 > > > loaded via a namespace (and not attached): > > [1] compiler_4.2.2 tools_4.2.2 grid_4.2.2 lattice_0.20-45 > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
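Putting those two observations together, one fix is to read the file as whitespace-delimited, skip the original header line, and supply the three column names yourself. The sketch below uses the text= argument so it can be run as-is, and fill = TRUE covers the first row, which has no lret value; for the real file, replace text = txt with the filename and add skip = 1 to jump over the header. This is an assumption about the file layout based on the sample shown, not a tested fix for the actual CSV.

```
library(zoo)

txt <- "02-01-1997 737.01
03-01-1997 748.03 1.48416235
06-01-1997 747.65 -0.050813009"

z <- read.zoo(text = txt, header = FALSE, fill = TRUE,
              col.names = c("Date", "AdjClose", "lret"),
              format = "%d-%m-%Y")
z
```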
Re: [R] Obtaining R-squared from All Possible Combinations of Linear Models Fitted
MuMIn is a package designed to select optimum models mainly based on information criteria. R-squared is not a suitable criterion for this purpose and, as far as I can see, is not covered in this package. (I presume you already know that R-squared for the model with all possible regressors is at least as great as R-squared with any subset of the regressors). If you want to calculate all these R-squareds it should be easy to write a small routine to estimate them. I am very curious as to why you wish to do this. John C Frain. 3 Aranleigh Park Rathfarnham Dublin 14 Ireland www.tcd.ie/Economics/staff/frainj/home.html https://jcfrain.wordpress.com/ https://jcfraincv19.wordpress.com/ mailto:fra...@tcd.ie mailto:fra...@gmail.com On Mon, 17 Jul 2023 at 18:25, Paul Bernal wrote: > Dear friends, > > I need to automatically fit all possible linear regression models (with all > possible combinations of regressors), and found the MuMIn package, which > has the dredge function. > > This is the dataset I am working with: > > dput(final_frame) > structure(list(y = c(41.9, 44.5, 43.9, 30.9, 27.9, 38.9, 30.9, > 28.9, 25.9, 31, 29.5, 35.9, 37.5, 37.9), x1 = c(6.6969, 8.7951, > 9.0384, 5.9592, 4.5429, 8.3607, 5.898, 5.6039, 4.9176, 6.2712, > 5.0208, 5.8282, 5.9894, 7.5422), x4 = c(1.488, 1.82, 1.5, 1.121, > 1.175, 1.777, 1.24, 1.501, 0.998, 0.975, 1.5, 1.225, 1.256, 1.69 > ), x8 = c(22, 50, 23, 32, 40, 48, 51, 32, 42, 30, 62, 32, 40, > 22), x2 = c(1.5, 1.5, 1, 1, 1, 1.5, 1, 1, 1, 1, 1, 1, 1, 1.5), > x7 = c(3, 4, 3, 3, 3, 4, 3, 3, 4, 2, 4, 3, 3, 3)), class = > "data.frame", row.names = c(NA, > -14L)) > > I started with the all regressor model, which I called globalmodel as > follows: > #Fitting Regression model with all possible combinations of regressors > options(na.action = "na.fail") # change the default "na.omit" to prevent > models > globalmodel <- lm(y~., data=final_frame) > > Then, the following code provides the different coefficients (for > regressors and the intercept) for each of the possible model combinations: > combinations <- dredge(globalmodel) > print(combinations) > I would like to retrieve the R-squared generated by each combination, but > have not been able to get it thus far. > > Any guidance on how to retrieve the R-squared from all linear model > combinations would be greatly appreciated. > > Kind regards, > Paul > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
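If the goal really is just the R-squared of every subset, the small routine John Frain mentions can be a few lines of base R; a sketch using the final_frame data frame from the question (combn() enumerates the regressor subsets, reformulate() builds each formula):

```
regressors <- setdiff(names(final_frame), "y")

# all non-empty subsets of the regressors
subsets <- unlist(lapply(seq_along(regressors),
                         function(k) combn(regressors, k, simplify = FALSE)),
                  recursive = FALSE)

# R-squared of the lm() fit for each subset
r2 <- sapply(subsets, function(vars) {
  summary(lm(reformulate(vars, response = "y"), data = final_frame))$r.squared
})
names(r2) <- sapply(subsets, paste, collapse = " + ")
sort(r2, decreasing = TRUE)
```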
Re: [R] ggplot: Can plot graphs with points, can't plot graph with points and line
Hi John, This should do what you want. I've changed your data.frame name for my own convenience to "dat1". ###=== dat1 <- data.frame( Time = c("Age.25","Age.35","Age.45","Age.55"), Medians = c(128.25,148.75,158.5,168.75) ) # create segments data.frame dat2 <- data.frame(x = dat1$Time[1:3], xend = dat1$Time[2:4], y = dat1$Medians[1:3], yend = dat1$Medians[2:4]) p1 <- ggplot(dat1 ,aes(x = Time, y = Medians)) + geom_point() p1 + geom_segment( x = "Age.25", y = 128.25, xend = "Age.35", yend = 148.75) + geom_segment( x = "Age.35", y = 148.75, xend = "Age.45", yend = 158.5) + geom_segment( x = "Age.45", y = 158.5, xend = "Age.55", yend = 168.55) # This solution shamelessly stolen from ## https://stackoverflow.com/questions/62536499/how-to-draw-multiple-line-segment-in-ggplot p1 + geom_segment( data = dat2, mapping = aes(x=x, y=y, xend=xend, yend=yend), inherit.aes = FALSE ) On Thu, 13 Jul 2023 at 01:11, Jim Lemon wrote: > Hi John, > I'm not sure how to do this with ggplot, but: > > Time<- c("Age.25","Age.35","Age.45","Age.55") > Medians<-c(128.25,148.75,158.5,168.75) > > is.character(Time) > # [1] TRUE - thus it has no intrinsic numeric value to plot > > is.numeric(Medians) > # [1] TRUE > # coerce Medians to factor and then plot against Time, but can't do > point/line > plot(as.factor(Time),Medians,type="p") > # let R determine the x values (1:4) and omit the x-axis > plot(Medians,type="b",xaxt="n") > # add the x-axis with the "Time" labels > axis(1,at=1:4,labels=Time) > > > On Thu, Jul 13, 2023 at 11:18 AM Sorkin, John > wrote: > > > > I am trying to plot four points, and join the points with lines. I can > plot the points, but I can't plot the points and the line. > > I hope someone can help my with my ggplot code. > > > > # load ggplot2 > > if(!require(ggplot2)){install.packages("ggplot2")} > > library(ggplot2) > > > > # Create data > > Time <- c("Age.25","Age.35","Age.45","Age.55") > > Medians<-c(128.25,148.75,158.5,168.75) > > themedians <- matrix(data=cbind(Time,Medians),nrow=4,ncol=2) > > dimnames(themedians) <- list(NULL,c("Time","Median")) > > # Convert to dataframe the data format used by ggplot > > themedians <- data.frame(themedians) > > themedians > > > > # This plot works > > ggplot(themedians,aes(x=Time,y=Median))+ > > geom_point() > > # This plot does not work! > > ggplot(themedians,aes(x=Time,y=Median))+ > > geom_point()+ > > geom_line() > > > > Thank you, > > John > > > > __ > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
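A simpler variant, not part of either reply: the failure in the code quoted above comes from two things, namely that matrix(data=cbind(Time,Medians)) turns Medians into character, and that geom_line() needs a group aesthetic when x is discrete. Building the data frame directly and adding group = 1 is enough; a minimal sketch:

```
library(ggplot2)

themedians <- data.frame(
  Time   = c("Age.25", "Age.35", "Age.45", "Age.55"),
  Median = c(128.25, 148.75, 158.5, 168.75)   # kept numeric
)

# group = 1 tells ggplot to connect the points across the discrete x values
ggplot(themedians, aes(x = Time, y = Median, group = 1)) +
  geom_point() +
  geom_line()
```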
[R] ggplot: Can plot graphs with points, can't plot graph with points and line
I am trying to plot four points, and join the points with lines. I can plot the points, but I can't plot the points and the line. I hope someone can help my with my ggplot code. # load ggplot2 if(!require(ggplot2)){install.packages("ggplot2")} library(ggplot2) # Create data Time <- c("Age.25","Age.35","Age.45","Age.55") Medians<-c(128.25,148.75,158.5,168.75) themedians <- matrix(data=cbind(Time,Medians),nrow=4,ncol=2) dimnames(themedians) <- list(NULL,c("Time","Median")) # Convert to dataframe the data format used by ggplot themedians <- data.frame(themedians) themedians # This plot works ggplot(themedians,aes(x=Time,y=Median))+ geom_point() # This plot does not work! ggplot(themedians,aes(x=Time,y=Median))+ geom_point()+ geom_line() Thank you, John __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Getting an error calling MASS::boxcox in a function
Hi Bert, On 2023-07-08 3:42 p.m., Bert Gunter wrote: Caution: This email may have originated from outside the organization. Please exercise additional caution with any links and attachments. Thanks John. ?boxcox says: * Arguments object a formula or fitted model object. Currently only lm and aov objects are handled. * I read that as saying that boxcox(lm(z+1 ~ 1),...) should run without error. But it didn't. And perhaps here's why: BoxCoxLambda <- function(z){ b <- MASS:::boxcox.lm(lm(z+1 ~ 1), lambda = seq(-5, 5, length.out = 61), plotit = FALSE) b$x[which.max(b$y)]# best lambda } lambdas <- apply(dd,2 , BoxCoxLambda) Error in NextMethod() : 'NextMethod' called from an anonymous function and, indeed, ?UseMethod says: "NextMethod should not be called except in methods called by UseMethod or from internal generics (see InternalGenerics). In particular it will not work inside anonymous calling functions (e.g., get("print.ts")(AirPassengers))." BUT BoxCoxLambda <- function(z){ b <- MASS:::boxcox(z+1 ~ 1, lambda = seq(-5, 5, length.out = 61), plotit = FALSE) b$x[which.max(b$y)]# best lambda } lambdas <- apply(dd,2 , BoxCoxLambda) lambdas [1] 0.167 0.167 As it turns out, it's the update() step in boxcox.lm() that fails, and the update takes place because $y is missing from the lm object, so the following works: BoxCoxLambda <- function(z){ b <- boxcox(lm(z + 1 ~ 1, y=TRUE), lambda = seq(-5, 5, length.out = 101), plotit = FALSE) b$x[which.max(b$y)] } The identical lambdas do not seem right to me; I think that's just an accident of the example (using the BoxCoxLambda() above): > apply(dd, 2, BoxCoxLambda, simplify = TRUE) [1] 0.2 0.2 > dd[, 2] <- dd[, 2]^3 > apply(dd, 2, BoxCoxLambda, simplify = TRUE) [1] 0.2 0.1 Best, John nor do I understand why boxcox.lm apparently throws the error while boxcox.formula does not (it also calls NextMethod()) So I would welcome clarification to clear my clogged (cerebral) sinuses. :-) Best, Bert On Sat, Jul 8, 2023 at 11:25 AM John Fox wrote: Dear Ron and Bert, First (and without considering why one would want to do this, e.g., adding a start of 1 to the data), the following works for me: -- snip -- > library(MASS) > BoxCoxLambda <- function(z){ + b <- boxcox(z + 1 ~ 1, + lambda = seq(-5, 5, length.out = 101), + plotit = FALSE) + b$x[which.max(b$y)] + } > mrow <- 500 > mcol <- 2 > set.seed(12345) > dd <- matrix(rgamma(mrow*mcol, shape = 2, scale = 5), nrow = mrow, ncol = +mcol) > dd1 <- dd[, 1] # 1st column of dd > res <- boxcox(lm(dd1 + 1 ~ 1), lambda = seq(-5, 5, length.out = 101), plotit + = FALSE) > res$x[which.max(res$y)] [1] 0.2 > apply(dd, 2, BoxCoxLambda, simplify = TRUE) [1] 0.2 0.2 -- snip -- One could also use the powerTransform() function in the car package, which in this context transforms towards *multi*normality: -- snip -- > library(car) Loading required package: carData > powerTransform(dd + 1) Estimated transformation parameters Y1Y2 0.1740200 0.2089925 I hope this helps, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ On 2023-07-08 12:47 p.m., Bert Gunter wrote: Caution: This email may have originated from outside the organization. Please exercise additional caution with any links and attachments. No, I'm afraid I'm wrong. Something went wrong with my R session and gave me incorrect answers. After restarting, I continued to get the same error as you did with my supposed "fix." So just ignore what I said and sorry for the noise. 
-- Bert On Sat, Jul 8, 2023 at 8:28 AM Bert Gunter wrote: Try this for your function: BoxCoxLambda <- function(z){ y <- z b <- boxcox(y + 1 ~ 1,lambda = seq(-5, 5, length.out = 61), plotit = FALSE) b$x[which.max(b$y)]# best lambda } ***I think*** (corrections and clarification strongly welcomed!) that `~` (the formula function) is looking for 'z' in the GlobalEnv, the caller of apply(), and not finding it. It finds 'y' here explicitly in the BoxCoxLambda environment. Cheers, Bert On Sat, Jul 8, 2023 at 4:28 AM Ron Crump via R-help wrote: Hi, Firstly, apologies as I have posted this on community.rstudio.com too. I want to optimise a Box-Cox transformation on columns of a matrix (ie, a unique lambda for each column). So I wrote a function that includes the call to MASS::boxcox in order that it can be applied to each column easily. Except that I'm getting an error when calling the function. If I just extract a column of the matrix
Re: [R] Getting an error calling MASS::boxcox in a function
Dear Ron and Bert, First (and without considering why one would want to do this, e.g., adding a start of 1 to the data), the following works for me: -- snip -- > library(MASS) > BoxCoxLambda <- function(z){ + b <- boxcox(z + 1 ~ 1, + lambda = seq(-5, 5, length.out = 101), + plotit = FALSE) + b$x[which.max(b$y)] + } > mrow <- 500 > mcol <- 2 > set.seed(12345) > dd <- matrix(rgamma(mrow*mcol, shape = 2, scale = 5), nrow = mrow, ncol = +mcol) > dd1 <- dd[, 1] # 1st column of dd > res <- boxcox(lm(dd1 + 1 ~ 1), lambda = seq(-5, 5, length.out = 101), plotit + = FALSE) > res$x[which.max(res$y)] [1] 0.2 > apply(dd, 2, BoxCoxLambda, simplify = TRUE) [1] 0.2 0.2 -- snip -- One could also use the powerTransform() function in the car package, which in this context transforms towards *multi*normality: -- snip -- > library(car) Loading required package: carData > powerTransform(dd + 1) Estimated transformation parameters Y1Y2 0.1740200 0.2089925 I hope this helps, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ On 2023-07-08 12:47 p.m., Bert Gunter wrote: Caution: This email may have originated from outside the organization. Please exercise additional caution with any links and attachments. No, I'm afraid I'm wrong. Something went wrong with my R session and gave me incorrect answers. After restarting, I continued to get the same error as you did with my supposed "fix." So just ignore what I said and sorry for the noise. -- Bert On Sat, Jul 8, 2023 at 8:28 AM Bert Gunter wrote: Try this for your function: BoxCoxLambda <- function(z){ y <- z b <- boxcox(y + 1 ~ 1,lambda = seq(-5, 5, length.out = 61), plotit = FALSE) b$x[which.max(b$y)]# best lambda } ***I think*** (corrections and clarification strongly welcomed!) that `~` (the formula function) is looking for 'z' in the GlobalEnv, the caller of apply(), and not finding it. It finds 'y' here explicitly in the BoxCoxLambda environment. Cheers, Bert On Sat, Jul 8, 2023 at 4:28 AM Ron Crump via R-help wrote: Hi, Firstly, apologies as I have posted this on community.rstudio.com too. I want to optimise a Box-Cox transformation on columns of a matrix (ie, a unique lambda for each column). So I wrote a function that includes the call to MASS::boxcox in order that it can be applied to each column easily. Except that I'm getting an error when calling the function. If I just extract a column of the matrix and run the code not in the function, it works. If I call the function either with an extracted column (ie dd1 in the reprex below) or in a call to apply I get an error (see the reprex below). I'm sure I'm doing something silly, but I can't see what it is. Any help appreciated. library(MASS) # Find optimised Lambda for Boc-Cox transformation BoxCoxLambda <- function(z){ b <- boxcox(lm(z+1 ~ 1), lambda = seq(-5, 5, length.out = 61), plotit = FALSE) b$x[which.max(b$y)]# best lambda } mrow <- 500 mcol <- 2 set.seed(12345) dd <- matrix(rgamma(mrow*mcol, shape = 2, scale = 5), nrow = mrow, ncol = mcol) # Try it not using the BoxCoxLambda function: dd1 <- dd[,1] # 1st column of dd bb <- boxcox(lm(dd1+1 ~ 1), lambda = seq(-5, 5, length.out = 101), plotit = FALSE) print(paste0("1st column's lambda is ", bb$x[which.max(bb$y)])) #> [1] "1st column's lambda is 0.2" # Calculate lambda for each column of dd lambdas <- apply(dd, 2, BoxCoxLambda, simplify = TRUE) #> Error in eval(predvars, data, env): object 'z' not found Created on 2023-07-08 with reprex v2.0.2 Thanks for your time and help. 
Ron [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Create a variable lenght string that can be used in a dimnames statement
My life is complete. I have inspired a fortune! John From: Rolf Turner Sent: Monday, July 3, 2023 6:34 PM To: Bert Gunter Cc: Sorkin, John; r-help@r-project.org (r-help@r-project.org); Achim Zeileis Subject: Re: [R] Create a variable lenght string that can be used in a dimnames statement On Mon, 3 Jul 2023 13:40:41 -0700 Bert Gunter wrote: > I am not going to try to sort out your confusion, as others have > already tried and failed. Fortune nomination!!! cheers, Rolf Turner -- Honorary Research Fellow Department of Statistics University of Auckland Stats. Dep't. (secretaries) phone: +64-9-373-7599 ext. 89622 Home phone: +64-9-480-4619 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Create a variable lenght string that can be used in a dimnames statement
Colleagues, I am sending this email again with a better description of my problem and the area where I need help. I need help creating a string of variables that will be accepted by the dimnames function. The string needs to start with the dimnames j and k followed by a series of dimnames xxx1, xxx2, ..., xxxn. I create xxx1, xxx2 (not going to xxxn to shorten the code below) as a string using a for loop and the paste function. I then use a paste function, zzz <- paste("j","k",string) to create the full set of dimnames, j, k, xxx1, xxx2 as a string. I create the matrix myvalues in the usual way and attempt to assign dim names to the matrix using the following dimnames statement, dimnames(myvalues)<-list(NULL,c(zzz)) The dimnames statement leads to the following error, Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent A colnames statement, colnames(myvalues)<-as.character(zzz) produces the same error. Can someone tell me how to create a string that can be used in the dimnames statement? Thank you (and please accept my apologies for double posting). John # create variable names xxx1 and xxx2. string="" for (j in 1:2){ name <- paste("xxx",j,sep="") string <- paste(string,name) print(string) } # Creation of xxx1 and xxx2 works string # Create matrix myvalues <- matrix(nrow=2,ncol=4) head(myvalues,1) # Add "j" and "k" to the string of column names zzz <- paste("j","k",string) zzz # assign column names, j, k, xxx1, xxx2 to the matrix # create column names, j, k, xxx1, xxx2. dimnames(myvalues)<-list(NULL,c(zzz)) colnames(myvalues) <- string __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Create matrix with column names wiht the same prefix xxxx and that end in 1, 2
Jeff, Again my thanks for your guidance. I replaced dimnames(myvalues)<-list(NULL,c(zzz)) with colnames(myvalues)<-zzz and get the same error, Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent It appears that I am creating the string zzz in a manner that is not compatable with either dimnames(myvalues)<-list(NULL,c(zzz)) or colnames(myvalues)<-zzz I think I need to modify the way I create the string zzz. # create variable names xxx1 and xxx2. string="" for (j in 1:2){ name <- paste("xxx",j,sep="") string <- paste(string,name) print(string) } # Creation of xxx1 and xxx2 works string # Create matrix myvalues <- matrix(nrow=2,ncol=4) head(myvalues,1) # Add "j" and "k" to the string of column names zzz <- paste("j","k",string) zzz # assign column names, j, k, xxx1, xxx2 to the matrix # create column names, j, k, xxx1, xxx2. dimnames(myvalues)<-list(NULL,c(zzz)) colnames(myvalues)<-zzz ____ From: Jeff Newmiller Sent: Monday, July 3, 2023 2:45 PM To: Sorkin, John Cc: r-help@r-project.org Subject: Re: [R] Create matrix with column names wiht the same prefix and that end in 1, 2 I really think you should read that help page. colnames() accesses the second element of dimnames() directly. On July 3, 2023 11:39:37 AM PDT, "Sorkin, John" wrote: >Jeff, >Thank you for your reply. >I should have said with dim names not column names. I want the Mateix to have >dim names, no row names, dim names j, k, xxx1, xxx2. > >John > >John David Sorkin M.D., Ph.D. >Professor of Medicine >Chief, Biostatistics and Informatics >University of Maryland School of Medicine Division of Gerontology and >Geriatric Medicine >Baltimore VA Medical Center >10 North Greene Street >GRECC (BT/18/GR) >Baltimore, MD 21201-1524 >(Phone) 410-605-7119 >(Fax) 410-605-7913 (Please call phone number above prior to >faxing) > >On Jul 3, 2023, at 2:11 PM, Jeff Newmiller wrote: > >?colnames > >On July 3, 2023 11:00:32 AM PDT, "Sorkin, John" >wrote: >I am trying to create an array, myvalues, having 2 rows and 4 columns, where >the column names are j,k,xxx1,xxx2. The code below fails, with the following >error, "Error in dimnames(myvalues) <- list(NULL, zzz) : >length of 'dimnames' [2] not equal to array extent" > >Please help me get the code to work. > >Thank you, >John > ># create variable names xxx1 and xxx2. >string="" >for (j in 1:2){ >name <- paste("xxx",j,sep="") >string <- paste(string,name) >print(string) >} ># Creation of xxx1 and xxx2 works >string > ># Create matrix >myvalues <- matrix(nrow=2,ncol=4) >head(myvalues,1) ># Add "j" and "k" to the string of column names >zzz <- paste("j","k",string) >zzz ># assign column names, j, k, xxx1, xxx2 to the matrix ># create column names, j, k, xxx1, xxx2. >dimnames(myvalues)<-list(NULL,zzz) > > >__ >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide http://www.r-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. > >-- >Sent from my phone. Please excuse my brevity. > >__ >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide http://www.r-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. -- Sent from my phone. Please excuse my brevity. 
__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Create matrix with column names wiht the same prefix xxxx and that end in 1, 2
Jeff, Thank you for your reply. I should have said dim names, not column names. I want the matrix to have dim names: no row names, and column names j, k, xxx1, xxx2. John John David Sorkin M.D., Ph.D. Professor of Medicine Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) On Jul 3, 2023, at 2:11 PM, Jeff Newmiller wrote: ?colnames On July 3, 2023 11:00:32 AM PDT, "Sorkin, John" wrote: I am trying to create an array, myvalues, having 2 rows and 4 columns, where the column names are j,k,xxx1,xxx2. The code below fails, with the following error, "Error in dimnames(myvalues) <- list(NULL, zzz) : length of 'dimnames' [2] not equal to array extent" Please help me get the code to work. Thank you, John # create variable names xxx1 and xxx2. string="" for (j in 1:2){ name <- paste("xxx",j,sep="") string <- paste(string,name) print(string) } # Creation of xxx1 and xxx2 works string # Create matrix myvalues <- matrix(nrow=2,ncol=4) head(myvalues,1) # Add "j" and "k" to the string of column names zzz <- paste("j","k",string) zzz # assign column names, j, k, xxx1, xxx2 to the matrix # create column names, j, k, xxx1, xxx2. dimnames(myvalues)<-list(NULL,zzz) __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Sent from my phone. Please excuse my brevity. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Create matrix with column names with the same prefix xxxx and that end in 1, 2
I am trying to create an array, myvalues, having 2 rows and 4 columns, where the column names are j,k,xxx1,xxx2. The code below fails, with the following error, "Error in dimnames(myvalues) <- list(NULL, zzz) : length of 'dimnames' [2] not equal to array extent" Please help me get the code to work. Thank you, John # create variable names xxx1 and xxx2. string="" for (j in 1:2){ name <- paste("xxx",j,sep="") string <- paste(string,name) print(string) } # Creation of xxx1 and xxx2 works string # Create matrix myvalues <- matrix(nrow=2,ncol=4) head(myvalues,1) # Add "j" and "k" to the string of column names zzz <- paste("j","k",string) zzz # assign column names, j, k, xxx1, xxx2 to the matrix # create column names, j, k, xxx1, xxx2. dimnames(myvalues)<-list(NULL,zzz) __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
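A minimal sketch of a fix for the error above (illustrative, not part of the original thread): paste() collapses its arguments into a single string, so zzz has length 1, while dimnames() needs a character vector whose length equals the number of columns (4 here). Building the names with c() and paste0() gives a vector of the right length.
# build a length-4 character vector of column names instead of one long string
myvalues <- matrix(nrow = 2, ncol = 4)
zzz <- c("j", "k", paste0("xxx", 1:2))
length(zzz)                        # 4, matches ncol(myvalues)
dimnames(myvalues) <- list(NULL, zzz)
head(myvalues, 1)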
Re: [R] Plotting factors in graph panel
gt; header=TRUE,stringsAsFactors=FALSE) > > at_df<-at_df[at_df$Income!="No_Answer",which(names(at_df)!="Bank_NA")] > > png("MF_Bank.png",height=600) > > par(mfrow=c(2,1)) > > matplot(at_df[,c("MF_None","MF_Equity","MF_Debt","MF_Hybrid")], > > type="l",col=1:4,lty=1:4,lwd=3, > > main="Percentages by Income and MF type", > > xlab="Income",ylab="Percentage of group",xaxt="n") > > axis(1,at=1:5,labels=at_df$Income) > > legend(3,24,c("MF_None","MF_Equity","MF_Debt","MF_Hybrid"), > > lty=1:4,lwd=3,col=1:4) > > matplot(at_df[,c("Bank_None","Bank_Current","Bank_Savings")], > > type="l",col=1:3,lty=1:4,lwd=3, > > main="Percentages by Income and Bank type", > > xlab="Income",ylab="Percentage of group",xaxt="n") > > axis(1,at=1:5,labels=at_df$Income) > > legend(3,54,c("Bank_None","Bank_Current","Bank_Savings"), > > lty=1:4,lwd=3,col=1:3) > > dev.off() > > > > Jim > > > > On Wed, Jun 28, 2023 at 6:33 PM Anupam Tyagi > wrote: > > > > > > Hello, > > > > > > I want to plot the following kind of data (percentage of respondents > > from a > > > survey) that varies by Income into many small *line* graphs in a > > > panel of graphs. I want to omit "No Answer" categories. I want to > > > see how each one of the categories (percentages), "None", " Equity", > > > etc. varies by > > Income. > > > How can I do this? How to organize the data well and how to plot? I > > thought > > > Lattice may be a good package to plot this, but I don't know for > > > sure. I prefer to do this in Base-R if possible, but I am open to > > > ggplot. Any > > ideas > > > will be helpful. > > > > > > Income > > > $10 $25 $40 $75 > $75 No Answer > > > MF 1 2 3 4 5 9 > > > None 1 3.05 2.29 2.24 1.71 1.30 2.83 Equity 2 29.76 28.79 29.51 > > > 28.90 31.67 36.77 Debt 3 31.18 32.64 34.31 35.65 37.59 33.15 Hybrid > > > 4 36.00 36.27 33.94 33.74 29.44 27.25 Bank AC None 1 46.54 54.01 > > > 59.1 62.17 67.67 60.87 Current 2 24.75 24.4 25 24.61 24.02 21.09 > > > Savings 3 25.4 18.7 29 11.48 7.103 13.46 No Answer 9 3.307 2.891 > > > 13.4 1.746 1.208 4.577 > > > > > > Thanks. > > > -- > > > Anupam. > > > > > > [[alternative HTML version deleted]] > > > > > > __ > > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > > > https://st/ > > > at.ethz.ch%2Fmailman%2Flistinfo%2Fr-help&data=05%7C01%7Ctebert%40ufl > > > .edu%7C59874e74164c46133f2c08db7853d28f%7C0d4da0f84a314d76ace60a6233 > > > 1e1b84%7C0%7C0%7C638236073642897221%7CUnknown%7CTWFpbGZsb3d8eyJWIjoi > > > MC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C > > > %7C%7C&sdata=xoaDMG7ogY4tMtqe30pONZrBdk0eq2cW%2BgdwlDHneWY%3D&reserv > > > ed=0 > > > PLEASE do read the posting guide > > http://www.r/ > > -project.org%2Fposting-guide.html&data=05%7C01%7Ctebert%40ufl.edu%7C59 > > 874e74164c46133f2c08db7853d28f%7C0d4da0f84a314d76ace60a62331e1b84%7C0% > > 7C0%7C638236073642897221%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiL > > CJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=H7 > > 6XCa%2FULBGUn0Lok93l6mtHzo0snq5G0a%2BL4sEH8%2F8%3D&reserved=0 > > > and provide commented, minimal, self-contained, reproducible code. > > > > > -- > Anupam. > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.r-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R does not run under latest RStudio
I have also had difficulty running R in RStudio. Has anyone else had problems? It will be a shame if we need to abandon R Studio. It is a very good IDE. John John David Sorkin M.D., Ph.D. Professor of Medicine Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) On Apr 6, 2023, at 5:30 PM, David Winsemius wrote: On 4/6/23 03:49, Steven Yen wrote: The RStudio list generally does not respond to free version users. I was hoping someone on this (R) list would be kind enough to help me. I don't think that is true. It is perhaps true that you cannot get personalized help from employed staff, but you can certainly submit to the Q&A forum. -- David Steven from iPhone On Apr 6, 2023, at 6:22 PM, Uwe Ligges wrote: No, but you need to ask on an RStudio mailing list. This one is about R. Best, Uwe Ligges On 06.04.2023 11:28, Steven T. Yen wrote: I updated to latest RStudio (RStudio-2023.03.0-386.exe) but R would not run. Error message: Error Starting R The R session failed to start. RSTUDIO VERSION RStudio 2023.03.0+386 "Cherry Blossom " (3c53477a, 2023-03-09) for Windows [No error available] I also tried RStudio 2022.12.0+353 --- same problem. I then tried another older version of RStudio (not sure version as I changed file name by accident) and R ran. Any clues? Please help. Thanks. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R does not run under latest RStudio
On Thu, 6 Apr 2023 17:28:32 +0800 "Steven T. Yen" wrote: > I updated to latest RStudio (RStudio-2023.03.0-386.exe) but > R would not run. Error message: > > Error Starting R > The R session failed to start. > > RSTUDIO VERSION > RStudio 2023.03.0+386 "Cherry Blossom " (3c53477a, 2023-03-09) for > Windows [No error available] > > I also tried RStudio 2022.12.0+353 --- same problem. > > I then tried another older version of RStudio (not sure version > as I changed file name by accident) and R ran. > > Any clues? Please help. Thanks. > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html and provide commented, > minimal, self-contained, reproducible code. > Just to be thorough, what version of R are you running? RStudio is its own project, and they have shifted their emphasis regarding R somewhat. The web site now states that the organization - now called Posit - is not de-emphasizing R so much as extending to embrace Python. The current version of RStudio requires R 3.3.0 or later. JWDougherty __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R does not run under latest RStudio
Does R run from a command prompt? If so, the problem is likely due to your Rstudio setup. If R does not run from a command prompt, any error messages might give some idea of the problem. I can run R and Rstudio in Windows 11?, Windows 10 and the current version of Linux Mint. On Thu 6 Apr 2023, 11:31 Uwe Ligges, wrote: > No, but you need to ask on an RStudio mailing list. > This one is about R. > > Best, > Uwe Ligges > > > > > On 06.04.2023 11:28, Steven T. Yen wrote: > > I updated to latest RStudio (RStudio-2023.03.0-386.exe) but > > R would not run. Error message: > > > > Error Starting R > > The R session failed to start. > > > > RSTUDIO VERSION > > RStudio 2023.03.0+386 "Cherry Blossom " (3c53477a, 2023-03-09) for > Windows > > [No error available] > > > > I also tried RStudio 2022.12.0+353 --- same problem. > > > > I then tried another older version of RStudio (not sure version > > as I changed file name by accident) and R ran. > > > > Any clues? Please help. Thanks. > > > > __ > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Simple Stacking of Two Columns
Hi R-Helpers, Sorry to bother you, but I have a simple task that I can't figure out how to do. For example, I have some names in two columns NamesWide<-data.frame(Name1=c("Tom","Dick"),Name2=c("Larry","Curly")) and I simply want to get a single column NamesLong<-data.frame(Names=c("Tom","Dick","Larry","Curly")) > NamesLong Names 1 Tom 2 Dick 3 Larry 4 Curly Stack produces an error NamesLong<-stack(NamesWide$Name1,NamesWide$Names2) Error in if (drop) { : argument is of length zero So does bind_rows > NamesLong<-dplyr::bind_rows(NamesWide$Name1,NamesWide$Name2) Error in `dplyr::bind_rows()`: ! Argument 1 must be a data frame or a named atomic vector. Run `rlang::last_error()` to see where the error occurred. I tried making separate dataframes to get around the error in bind_rows but it puts the data in two different columns Name1<-data.frame(c("Tom","Dick")) Name2<-data.frame(c("Larry","Curly")) NamesLong<-dplyr::bind_rows(Name1,Name2) > NamesLong c..TomDick.. c..LarryCurly.. 1 Tom 2 Dick 3Larry 4Curly gather makes no change to the data NamesLong<-gather(NamesWide,Name1,Name2) > NamesLong Name1 Name2 1 Tom Larry 2 Dick Curly Please help me solve what should be a very simple problem. Thanks, John Sparks [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
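A minimal base-R sketch that produces the single column asked for above (illustrative only, assuming character columns as in R >= 4.0 where stringsAsFactors defaults to FALSE): c() simply concatenates the two columns, and unlist() does the same for all columns at once.
NamesWide <- data.frame(Name1 = c("Tom", "Dick"), Name2 = c("Larry", "Curly"))
NamesLong <- data.frame(Names = c(NamesWide$Name1, NamesWide$Name2))
# or, for any number of columns:
NamesLong2 <- data.frame(Names = unname(unlist(NamesWide)))
NamesLong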
Re: [R] How to test the difference between paired correlations?
1. estimate r 2. do the z transformation - z is a simple function of r - z has an approximate standard normal distribution. 3. use the normal distribution tables to decide on the significance of z or of differences between two z's. I don't see the need for packages. John C Frain 3 Aranleigh Park Rathfarnham Dublin 14 Ireland www.tcd.ie/Economics/staff/frainj/home.html https://jcfrain.wordpress.com/ https://jcfraincv19.wordpress.com/ mailto:fra...@tcd.ie mailto:fra...@gmail.com On Thu, 23 Mar 2023 at 09:30, Luigi Marongiu wrote: > Thank you, but this now sounds more difficult: what would be the point > in having these ready-made functions if I have to do it manually? > Anyway, How would I implement the last part? > > On Thu, Mar 23, 2023 at 1:23 AM Ebert,Timothy Aaron > wrote: > > > > If you are open to other options: > > The null hypothesis is that there is no difference. > >If I have two equations y=x and y=z and there is no difference then > it would not matter if an observation was from x or z. > >Randomize the x and z observations. For each randomization calculate > a correlation for y=x and for y=z. > >At each iteration calculate the absolute value of the difference in > the correlations. > >Generate a frequency distribution from 100,000+ randomizations. > >Find the observed difference in the frequency from random > distributions. > >What proportion of observations are as large or larger than the > observed. This is your p-value. > > > > Tim > > > > -Original Message- > > From: R-help On Behalf Of Luigi Marongiu > > Sent: Wednesday, March 22, 2023 5:12 PM > > To: r-help > > Subject: [R] How to test the difference between paired correlations? > > > > [External Email] > > > > Hello, > > I have three numerical variables and I would like to test if their > correlation is significantly different. > > I have seen that there is a package that "Test the difference between > two (paired or unpaired) correlations". > > [ https://www.personality-project.org/r/html/paired.r.html ] > > However, there is the need to convert the correlations to "z scores > using the Fisher r-z transform". I have seen that there is another package > that does that [ https://search.r-project.org/CRAN/refmans/DescTools/html/FisherZ.html ]. > > Yet, I do not understand how to process the data. Shall I pass the raw > data or the correlations directly?
> > > > I have made the following working example: > > ``` > > # define data > > v1 <- c(62.480, 59.492, 74.060, 88.519, 91.417, 53.907, 64.202, > 62.426, > > 54.406, 88.117) > > v2 <- c(56.814, 42.005, 56.074, 65.990, 81.572, 53.855, 50.335, 63.537, > 41.713, > > 78.265) > > v3 <- c(54.170, 64.224, 57.569, 85.089, 104.056, 48.713, 61.239, > 60.290, > > 67.308, 71.179) > > # visual exploration > > par(mfrow=c(2, 1)) > > plot(v2~v1, ylim=c(min(c(v1,v2,v3)), max(c(v1,v2,v3))), > > xlim=c(min(c(v1,v2,v3)), max(c(v1,v2,v3))), > > main="V1 vs V2") > > abline(lm(v2~v1)) > > plot(v3~v1, ylim=c(min(c(v1,v2,v3)), max(c(v1,v2,v3))), > > xlim=c(min(c(v1,v2,v3)), max(c(v1,v2,v3))), > > main="V1 vs V3") > > abline(lm(v3~v1)) > > ## test differences in correlation > > # convert raw data into z-scores > > library(psych) > > library(DescTools) > > FisherZ(v1) # I cannot convert the raw data into z scores (same for the > other variables): > > > [1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN Warning message: > > > In log((1 + rho)/(1 - rho)) : NaNs produced > > # convert correlations into z scores > > # (the correlation score of 0.79 has been converted into 1.08; is this > correct?) > > FisherZ(lm(v2~v1)$coefficients[2]) > > > v1 > > > 1.081667 >
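A sketch of the manual route described above (estimate r, Fisher-transform, compare on the normal scale), using the v1/v2/v3 vectors from the quoted example. Two cautions: a Fisher transform expects a correlation in (-1, 1), which is why applying FisherZ() to the raw data produced NaNs; and the simple z test below treats the two correlations as independent, whereas cor(v1, v2) and cor(v1, v3) share v1, the dependent case that psych's paired.r (linked in the question) is designed for.
v1 <- c(62.480, 59.492, 74.060, 88.519, 91.417, 53.907, 64.202, 62.426, 54.406, 88.117)
v2 <- c(56.814, 42.005, 56.074, 65.990, 81.572, 53.855, 50.335, 63.537, 41.713, 78.265)
v3 <- c(54.170, 64.224, 57.569, 85.089, 104.056, 48.713, 61.239, 60.290, 67.308, 71.179)
r12 <- cor(v1, v2); r13 <- cor(v1, v3)        # step 1: estimate the correlations
z12 <- atanh(r12);  z13 <- atanh(r13)         # step 2: Fisher r-to-z, 0.5*log((1+r)/(1-r))
n <- length(v1)
z <- (z12 - z13) / sqrt(1/(n - 3) + 1/(n - 3))   # step 3: compare on the normal scale
p <- 2 * pnorm(-abs(z))                          # two-sided p-value
c(z = z, two.sided.p = p)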
Re: [R] loess plotting problem
Dear , On 2023-03-23 11:08 a.m., Anupam Tyagi wrote: Thanks, John. However, loess.smooth() is producing a very different curve compared to the one that results from applying predict() on a loess(). I am guessing they are using different defaults. Correct? No need to guess. Just look at the help pages ?loess and ?loess.smooth. If you don't like the default for loess.smooth(), just specify the arguments you want. Best, John On Thu, 23 Mar 2023 at 20:20, John Fox <mailto:j...@mcmaster.ca>> wrote: Dear Anupam Tyagi, You didn't include your data, so it's not possible to see exactly what happened, but I think that you misunderstand the object that loess() returns. It returns a "loess" object with several components, including the original data in x and y. So if pass the object to lines(), you'll simply connect the points, and if x isn't sorted, the points won't be in order. Try, e.g., plot(speed ~ dist, data=cars) m <- loess(speed ~ dist, data=cars) names(m) lines(m) You'd do better to use loess.smooth(), which is intended for adding a loess regression to a scatterplot; for example, plot(speed ~ dist, data=cars) with(cars, lines(loess.smooth(dist, speed))) Other points: You don't have to load the stats package which is available by default when you start R. It's best to avoid attach(), the use of which can cause confusion. I hope this helps, John -- * preferred email: john.david@proton.me <mailto:john.david@proton.me> John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ <https://www.john-fox.ca/> On 2023-03-23 10:18 a.m., Anupam Tyagi wrote: > For some reason the following code is not plotting as I want it to. I want > to plot a "loess" line plotted over a scatter plot. I get a jumble, with > lines connecting all the points. I had a similar problem with "lowess". I > solved that by dropping "NA" rows from the data columns. Please help. > > library(stats) > attach(gini_pci_wdi_narm) > plot(ny_gnp_pcap_pp_kd, si_pov_gini) > lines(loess(si_pov_gini ~ ny_gnp_pcap_pp_kd, gini_pci_wdi_narm)) > detach(gini_pci_wdi_narm) > -- Anupam. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] loess plotting problem
Dear Anupam Tyagi, You didn't include your data, so it's not possible to see exactly what happened, but I think that you misunderstand the object that loess() returns. It returns a "loess" object with several components, including the original data in x and y. So if pass the object to lines(), you'll simply connect the points, and if x isn't sorted, the points won't be in order. Try, e.g., plot(speed ~ dist, data=cars) m <- loess(speed ~ dist, data=cars) names(m) lines(m) You'd do better to use loess.smooth(), which is intended for adding a loess regression to a scatterplot; for example, plot(speed ~ dist, data=cars) with(cars, lines(loess.smooth(dist, speed))) Other points: You don't have to load the stats package which is available by default when you start R. It's best to avoid attach(), the use of which can cause confusion. I hope this helps, John -- * preferred email: john.david@proton.me John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ On 2023-03-23 10:18 a.m., Anupam Tyagi wrote: For some reason the following code is not plotting as I want it to. I want to plot a "loess" line plotted over a scatter plot. I get a jumble, with lines connecting all the points. I had a similar problem with "lowess". I solved that by dropping "NA" rows from the data columns. Please help. library(stats) attach(gini_pci_wdi_narm) plot(ny_gnp_pcap_pp_kd, si_pov_gini) lines(loess(si_pov_gini ~ ny_gnp_pcap_pp_kd, gini_pci_wdi_narm)) detach(gini_pci_wdi_narm) __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
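A short sketch reconciling the two approaches discussed above (using the cars data, illustrative only): if you draw the line from predict() on a loess() fit, sort the x values first; and the two curves differ mainly because loess() defaults to span = 0.75, degree = 2 and a Gaussian fit, while loess.smooth() defaults to span = 2/3, degree = 1 and family = "symmetric", so matching those arguments should make the lines agree.
plot(speed ~ dist, data = cars)
m  <- loess(speed ~ dist, data = cars)              # loess() defaults
xs <- sort(unique(cars$dist))                       # predict on ordered x values
lines(xs, predict(m, newdata = data.frame(dist = xs)), col = "blue")
# roughly reproduce loess.smooth()'s curve by matching its defaults
m2 <- loess(speed ~ dist, data = cars, span = 2/3, degree = 1, family = "symmetric")
lines(xs, predict(m2, newdata = data.frame(dist = xs)), col = "red")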
[R] Error: 'format_glimpse' is not an exported object from 'namespace:pillar'
I am receiving the following error message. I don't understand what it means, and I don't know how to fix it. I am running my code in R studio. I do not know if the error comes from R or RStudio. Please see session data below, Thank you, John version data: platform x86_64-w64-mingw32 arch x86_64 os mingw32 system x86_64, mingw32 status major 3 minor 6.1 year 2019 month 07 day05 svn rev76782 language R version.string R version 3.6.1 (2019-07-05) nickname Action of the Toes Rstudio.version() $mode [1] "desktop" $version [1] ‘2023.3.0.386’ $long_version [1] "2023.03.0+386" $release_name [1] "Cherry Blossom" __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
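A hedged note, not from the thread: this error usually indicates a package version mismatch rather than an R or RStudio bug. format_glimpse() is a relatively recent addition to pillar, so another package (typically tibble, which RStudio uses for printing) is asking for a function that the installed, older pillar does not export. Updating pillar and tibble from a plain R session and restarting RStudio is a reasonable first step; with R 3.6.1 the available package versions may themselves be old, so upgrading R may also be needed.
# run in a fresh R session, then restart R/RStudio and retry
install.packages(c("pillar", "tibble"))
packageVersion("pillar")   # check what actually got installed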
Re: [R] Good Will Legal Question
Dear Timothy, On 2023-03-21 1:38 p.m., Ebert,Timothy Aaron wrote: My guess: It I clear from the link that they can use the R logo for commercial purposes. The issue is what to do about the "appropriate credit" and "link to the license." How would I do that on a hoodie? Would they need a web address or something? That's a good question, and one that I missed -- the implicit focus is on using the logo, e.g., in software. With the caveat that I'm not speaking for the R Foundation, I think that it would be sufficient to provide credit and a link to the license on the webpage that sells the hoodie. FWIW, I (and I expect you) have seen many t-shirts, etc., with R logos, some from companies, and I even have a few. I doubt that anyone will care. Best, John -Original Message----- From: R-help On Behalf Of John Fox Sent: Tuesday, March 21, 2023 1:19 PM To: Coding Hoodies Cc: r-help@r-project.org Subject: Re: [R] Good Will Legal Question [External Email] Dear Arid Sweeting, R-help is probably not the place to ask this question, although perhaps since you're seeking moral advice, people might want to say something. I would normally expect to see a query like this addressed to the R website webmasters, of which I'm one -- with the caveat that the R Foundation doesn't give legal advice. Just to be sure, you say that you read the rules for use of the R logo, so I assume that you've seen <https://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.r-project.org%2Flogo%2F&data=05%7C01%7Ctebert%40ufl.edu%7C99f01774c9f5452bd99a08db2a31ec23%7C0d4da0f84a314d76ace60a62331e1b84%7C0%7C0%7C638150166126816193%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=jNvmCKITcZFcmqiRqkjqZnJVY3TYuD3wu3Mp0zhSHPs%3D&reserved=0>, which seems entirely clear to me. I think that it's safe to say that if the R Foundation wanted to limit commercial use of the R logo, it wouldn't have released it under the CC-BY-SA 4.0 license. I'm not sure what moral issues concern you. I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsocialsciences.mcmaster.ca%2Fjfox%2F&data=05%7C01%7Ctebert%40ufl.edu%7C99f01774c9f5452bd99a08db2a31ec23%7C0d4da0f84a314d76ace60a62331e1b84%7C0%7C0%7C638150166126816193%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=iLeUGFcyjk3kYNi2v8fV1jgc9M9OVdWYv9nJeI1G7Q4%3D&reserved=0 On 2023-03-21 6:18 a.m., Coding Hoodies wrote: Hi R Team!, We are opening a new start up soon, codinghoodies.com, we want to make coders feel stylish. Out of goodwill I wanted to ask you formally if I can have permission to use the standard R logo on the front of hoodies to sell? I have read your rules but wanted to ask as I feel a moral right to email you asking to show support and respect for the R project. If it makes it easier I could build send a picture of the hoodie with the logo on to you to see if this is acceptable. 
Arid Sweeting __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] DOUBT
Dear Nandiniraj, Please cc r-help in your emails so that others can see what happened with your problem. You don't provide enough information to know what exactly is the source of your problem -- you're more likely to get effective help if you provide a minimal reproducible example of the problem -- but it's a good guess that the variable (HHsize or perhaps some other variable) isn't in the newdata data frame. Best, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://www.john-fox.ca/ On 2023-03-21 1:24 p.m., Nandini raj wrote: I removed space even though it is showing error. I.e Variable not found Nandiniraj On Tue, Mar 21, 2023, 10:36 PM John Fox <mailto:j...@mcmaster.ca>> wrote: Dear Nandini raj, You have a space in the variable name "HH size". I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ <https://socialsciences.mcmaster.ca/jfox/> On 2023-03-20 1:16 p.m., Nandini raj wrote: > Respected sir/madam > can you please suggest what is an unexpected symbol in the below code for > running a multinomial logistic regression > > model <- multinom(adoption ~ age + education + HH size + landholding + > Farmincome + nonfarmincome + creditaccesibility + LHI, data=newdata) > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org <mailto:R-help@r-project.org> mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help <https://stat.ethz.ch/mailman/listinfo/r-help> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html <http://www.R-project.org/posting-guide.html> > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Good Will Legal Question
Dear Arid Sweeting, R-help is probably not the place to ask this question, although perhaps since you're seeking moral advice, people might want to say something. I would normally expect to see a query like this addressed to the R website webmasters, of which I'm one -- with the caveat that the R Foundation doesn't give legal advice. Just to be sure, you say that you read the rules for use of the R logo, so I assume that you've seen <https://www.r-project.org/logo/>, which seems entirely clear to me. I think that it's safe to say that if the R Foundation wanted to limit commercial use of the R logo, it wouldn't have released it under the CC-BY-SA 4.0 license. I'm not sure what moral issues concern you. I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2023-03-21 6:18 a.m., Coding Hoodies wrote: Hi R Team!, We are opening a new start up soon, codinghoodies.com, we want to make coders feel stylish. Out of goodwill I wanted to ask you formally if I can have permission to use the standard R logo on the front of hoodies to sell? I have read your rules but wanted to ask as I feel a moral right to email you asking to show support and respect for the R project. If it makes it easier I could build send a picture of the hoodie with the logo on to you to see if this is acceptable. Arid Sweeting __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] DOUBT
Dear Nandini raj, You have a space in the variable name "HH size". I hope this helps, John John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2023-03-20 1:16 p.m., Nandini raj wrote: Respected sir/madam can you please suggest what is an unexpected symbol in the below code for running a multinomial logistic regression model <- multinom(adoption ~ age + education + HH size + landholding + Farmincome + nonfarmincome + creditaccesibility + LHI, data=newdata) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
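A small illustrative sketch (with hypothetical data, not the poster's) of the two usual remedies: either make the column names syntactic with make.names() and use the new name in the formula, or keep the space and wrap the name in backticks.
newdata <- data.frame(adoption = factor(c("no", "yes", "yes", "no", "yes")),
                      age = c(30, 45, 52, 38, 61),
                      `HH size` = c(4, 6, 3, 5, 2),
                      check.names = FALSE)
names(newdata) <- make.names(names(newdata))     # "HH size" becomes "HH.size"
library(nnet)
fit <- multinom(adoption ~ age + HH.size, data = newdata)
# alternatively, without renaming: multinom(adoption ~ age + `HH size`, data = newdata)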
Re: [R] Trying to learn how to write an "advanced" function
Although I owe thanks to Rasmus and Ivan, I still do not know how to write an "advanced" function. My most recent try (after looking at the material Rasmus and Ivan sent) still does not work. I am trying to run the lm function on two different formulae: 1) y~x, 2) y~x+z Any corrections would be appreciated! Thank you, John doit <- function(x){ ds <- deparse(substitute(x)) cat("1\n") print(ds) eval(lm(quote(ds)),parent.frame()) } # define data that will be used in regression y <- 1:10 x <- y+rnorm(10) z <- c(rep(1,5),rep(2,5)) # Show what x, y and z look like rbind(x,y,z) # run formula y~x JD <- doit(y~x) JD # run formula y~x+z JD2 <- doit(y~x+z) JD2 From: R-help on behalf of Rasmus Liland Sent: Thursday, March 16, 2023 8:42 AM To: r-help Subject: Re: [R] Trying to learn how to write an "advanced" function On 2023-03-16 12:11 +, Sorkin, John wrote: > (1) can someone point me to an > explanation of match.call or match > that can be understood by the > uninitiated? Dear John, the man page ?match tells us that match matches the first vector against the second, and returns a vector of indices the same length as the first, e.g. > match(c("formula", "data", "subset", "weights", "na.action", "offset"), c("Maryland", "formula", "data", "subset", "weights", "na.action", "offset", "Sorkin", "subset"), 0L) [1] 2 3 4 5 6 7 perhaps a bad answer ... > (2) can someone point me to a document > that will help me learn how to write > an "advanced" function? Perhaps the background here is looking at the lm function as a basis for writing something more advanced, then the exercise becomes looking at dput(lm), understanding every line by looking up all the functions you do not understand in the man pages e.g. ?match. Remember, you can search for things inside R by using double questionmark, ??match, finding versions of match existing inside other installed packages, e.g. raster::match and posterior::match, perhaps this exercise becomes writing ones own version of lm inside ones own package? Best, Rasmus __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
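A working sketch of one common idiom for this kind of wrapper (illustrative; there are other ways): rather than deparsing the formula, capture the call with match.call(), swap the function being called to lm, and evaluate the result where doit() was called so that y, x and z are found.
doit <- function(formula, data, ...) {
  cl <- match.call()              # e.g. doit(formula = y ~ x)
  cl[[1L]] <- quote(stats::lm)    # turn it into stats::lm(formula = y ~ x)
  eval(cl, parent.frame())        # evaluate in the caller's environment
}
y <- 1:10
x <- y + rnorm(10)
z <- c(rep(1, 5), rep(2, 5))
JD  <- doit(y ~ x)       # same fit as lm(y ~ x)
JD2 <- doit(y ~ x + z)   # same fit as lm(y ~ x + z)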
[R] Trying to learn how to write an "advanced" function
I am trying to understand how to write an "advanced" function. To do so, I am examining the lm function, a portion of which is pasted below. I am unable to understand what match.call or match does, and several other parts of lm, even when I read the help page for match.call or match. (1) can someone point me to an explanation of match.call or match that can be understood by the uninitiated? (2) can someone point me to a document that will help me learn how to write an "advanced" function? Thank you, John > lm function (formula, data, subset, weights, na.action, method = "qr", model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, contrasts = NULL, offset, ...) { ret.x <- x ret.y <- y cl <- match.call() mf <- match.call(expand.dots = FALSE) m <- match(c("formula", "data", "subset", "weights", "na.action", "offset"), names(mf), 0L) __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
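A small sketch that may make those two lines easier to see in isolation (illustrative): match.call() returns the call with the supplied arguments matched to their parameter names, and match() then reports where each expected argument sits in that call, with 0 for arguments that were not supplied.
f <- function(formula, data, subset, weights) {
  mf <- match.call(expand.dots = FALSE)
  print(mf)                                   # f(formula = y ~ x, data = mtcars)
  match(c("formula", "data", "subset", "weights"), names(mf), 0L)
}
f(y ~ x, data = mtcars)
# [1] 2 3 0 0   -- formula is element 2 of the call, data is element 3, the rest absent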
[R] Trying to learn how to write a function
I am trying to understand how to write an "advanced" function. To do this, I am examining the code of lm, a small part of the lm code is below. N > lm function (formula, data, subset, weights, na.action, method = "qr", model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, contrasts = NULL, offset, ...) { ret.x <- x ret.y <- y cl <- match.call() mf <- match.call(expand.dots = FALSE) m <- match(c("formula", "data", "subset", "weights", "na.action", "offset"), names(mf), 0L) __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] tcl tk: set the position button
Dear Rodrigo, Try tkwm.geometry(win1, "-0+0"), which should position win1 at the top right. I hope this helps, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2023-03-12 8:41 p.m., Rodrigo Badilla wrote: Hi all, I am using tcltk2 library to show buttons and messages. Everything work fine but I would like set the tk2button to the right of my screen, by default it display at the left of my screen. my script example: library(tcltk2) win1 <- tktoplevel() butOK <- tk2button(win1, text = "TEST", width = 77) tkgrid(butOK) Thanks in advance Saludos Rodrigo -- Este correo electrónico ha sido analizado en busca de virus por el software antivirus de Avast. www.avast.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
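Put together, a self-contained sketch of the suggestion (assuming the tcltk2 package used in the question): in Tk geometry strings a leading "-" measures the offset from the right edge of the screen, so "-0+0" places the window flush with the top-right corner, while "+0+0" would be top-left.
library(tcltk2)
win1  <- tktoplevel()
butOK <- tk2button(win1, text = "TEST", width = 77)
tkgrid(butOK)
tkwm.geometry(win1, "-0+0")   # x offset from the right edge, y offset from the top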
Re: [R] Shaded area
As Peter says, the list is very cautious about what types of files it allows. A handy way to supply some sample data is the dput() function. In the case of a large dataset something like dput(head(mydata, 100)) should supply the data we need. Just do dput(mydata) where *mydata* is your data. Copy the output and paste it here. On Wed, 1 Mar 2023 at 09:58, PIKAL Petr wrote: > Hallo > > Excel attachment is not allowed here, but shading area is answered many > times elsewhere. Use something like . "shading area r" in google. > > See eg. > https://www.geeksforgeeks.org/how-to-shade-a-graph-in-r/ > > Cheers Petr > > -Original Message- > From: R-help On Behalf Of George Brida > Sent: Wednesday, March 1, 2023 3:21 PM > To: r-help@r-project.org > Subject: [R] Shaded area > > Dear R users, > > I have an xlsx file (attached to this mail) that shows the values of a > "der" series observed on a daily basis from January 1, 2017 to January 25, > 2017. This series is strictly positive during two periods: from January 8, > 2017 to January 11, 2017 and from January 16, 2017 to January 20, 2017. I > would like to plot the series with two shaded areas corresponding to the > positivity of the series. Specifically, I would like to draw 4 vertical > lines intersecting the x-axis in the 4 dates mentioned above and shade the > two areas of positivity. Thanks for your help. > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > Osobní údaje: Informace o zpracování a ochraně osobních údajů obchodních > partnerů PRECHEZA a.s. jsou zveřejněny na: > https://www.precheza.cz/zasady-ochrany-osobnich-udaju/ | Information > about processing and protection of business partner’s personal data are > available on website: > https://www.precheza.cz/en/personal-data-protection-principles/ > Důvěrnost: Tento e-mail a jakékoliv k němu připojené dokumenty jsou > důvěrné a podléhají tomuto právně závaznému prohláąení o vyloučení > odpovědnosti: https://www.precheza.cz/01-dovetek/ | This email and any > documents attached to it may be confidential and are subject to the legally > binding disclaimer: https://www.precheza.cz/en/01-disclaimer/ > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
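Since the original attachment never made it to the list, here is an illustrative sketch with made-up data showing a common base-R recipe: find the runs where the series is strictly positive with rle(), shade each run with rect() before drawing the line, and mark the run boundaries with vertical lines.
dates <- seq(as.Date("2017-01-01"), as.Date("2017-01-25"), by = "day")
der   <- sin(seq_along(dates) / 3)            # stand-in for the real 'der' series
plot(dates, der, type = "n", xlab = "Date", ylab = "der")
runs   <- rle(der > 0)
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1
for (i in which(runs$values)) {               # one shaded rectangle per positive run
  rect(dates[starts[i]], par("usr")[3], dates[ends[i]], par("usr")[4],
       col = "grey90", border = NA)
  abline(v = dates[c(starts[i], ends[i])], lty = 2)
}
lines(dates, der)
abline(h = 0, col = "grey50")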
Re: [R] Generic Function read?
Have a look at the {rio} package. On Tue, 28 Feb 2023 at 15:00, Leonard Mada via R-help wrote: > Dear R-Users, > > I noticed that *read* is not a generic function. Although it could > benefit from the functionality available for generic functions: > > read = function(file, ...) UseMethod("read") > > methods(read) > # [1] read.csv read.csv2read.dcf read.delim read.delim2 > read.DIF read.fortran > # [8] read.ftable read.fwf read.socket read.table > > The users would still need to call the full function name. But it seems > useful to be able to find rapidly what formats can be read; including > with other packages (e.g. for Excel, SAS, ... - although most packages > do not adhere to the generic naming convention, but maybe they will > change in the future). > > Note: > This should be possible (even though impractical), but actually does NOT > work: > read = function(file, ...) UseMethod("read") > file = "file.csv" > class(file) = c("csv", class(file)); > read(file) > > Should it not work? > > Sincerely, > > Leonard > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
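A brief illustrative sketch of the {rio} suggestion (the file name is hypothetical): import() chooses a reader from the file extension and export() does the reverse, which covers most of the read.* cases listed in the quoted message.
library(rio)
export(mtcars, "mtcars.csv")    # writes a CSV, chosen from the extension
dat <- import("mtcars.csv")     # reads it back with the matching reader
head(dat)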
Re: [R] MFA variables graph, filtered by separate.analyses
Dear gavin, I think that it's likely that Jim meant the hetcor() function in the polycor package. Best, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2023-02-21 5:42 p.m., gavin duley wrote: Hi Jim, On Tue, 21 Feb 2023 at 22:17, Jim Lemon wrote: I can't work through this right now, but I would start by looking at the 'hetcor' package to get the correlations, or if they are already in the return object, build a plot from these. Thanks for the suggestion. I'll read up on the 'hetcor' package. Thanks, gavin, __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Bug in R-Help Archives?
My apologies, I did not mean to be part of the discussion. If there is such a thing as a pocket email (similar to a pocket dial) the email would be classified as a pocket email. John From: R-help on behalf of Rui Barradas Sent: Friday, January 27, 2023 10:15 AM To: Ivan Krylov Cc: R-help Mailing List Subject: Re: [R] Bug in R-Help Archives? At 07:36 on 27/01/2023, Ivan Krylov wrote: > On Fri, 27 Jan 2023 13:01:39 +0530 > Deepayan Sarkar wrote: >> From looking at the headers in John Sorkin's mail, my guess is that he >> just replied to the other thread rather than starting a fresh email, >> and in his attempts to hide that, was outsmarted by Outlook. > > That's 100% correct. The starting "Pipe operator" e-mail has > In-Reply-To: <047e01d91ed5$577e42a0$067ac7e0$@yahoo.com>, and the > message with this Message-ID is the one from Mukesh Ghanshyamdas > Lekhrajani with the subject "Re: [R] R Certification" that's > immediately above the message by John Sorkin. > Thanks, I was searching the archives for something else, stumbled on that and forgot to look at the headers. Good news there's nothing wrong with R-Help. Rui Barradas __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] return value of {....}
Avi, Please do not mistake my posting as being a BASHING of R. I greatly admire R and the progress it has made from its roots in S. I thank the may people who contribute to the development and growth of R. Just because a language allows a given syntax does not mean (1) that the language is bad or (2) that the syntax should be used (except in rare occasions). There may well be a few occasions when using a global variable in a function makes sense and in that instance the global variable should be used, and the usage should be documented in the comments that are part of the source code. Please note that I stated "A general recommendation use of a global variable in a function"; I used the words "recommendation is to AVOID". I did not, and would not forbid the use of a global variable in a function. English has the word "or" which is not clearly defined; is it an exclusive or or an inclusive or? I don't bash English because of this semantic ambiguity. What I try to do is make certain that when it is essential to understand if the "or" I use in a sentence in exclusive vs. inclusive (or conversely) I make certain my meaning is clear. e.g. You can use bleach or ammonia to clean the stain, but NEVER use ammonia and bleach together as the combination produces a deadly gas. I try to follow this philosophy when I program. I don't use global variables in a function unless there is an overwhelming reason to do so. When I do, I indicate that that a global variable has been used in my comments. In the same vane, I rarely use call by reference in my programs (when this is allowed by a programming language); I try to use call by value whenever possible as call by reference can be fraught. On the other hand when working with extremely large data objects (especially in the old days when I had perhaps 20k of memory rather than 100 gig), I have used call by reference to save storage. Despite this, I know that call by reference is not recommended just as using a global variable in a function is not recommended, but can, and should be used when needed. John From: R-help on behalf of avi.e.gr...@gmail.com Sent: Sunday, January 15, 2023 10:53 PM Cc: 'R help Mailing list' Subject: Re: [R] return value of {} Again, John, we are comparing different designs in languages that are often decades old and partially retrofitted selectively over the years. Is it poor form to use global variables? Many think so. Discussions have been had on how to use variables hidden in various ways that are not global, such as within a package. But note R still has global assignment operators like <<- and its partner ->> that explicitly can even create a global variable that did not exist before the function began and that persists for that session. This is perhaps a special case of the assign() function which can do the same for any designated environment. Although it may sometimes be better form to avoid things like this, it can also be worse form when you want to do something decentralized with no control over passing additional arguments to all kinds of functions. Some languages try to finesse their way past this by creating concepts like closures that hold values and can even change the values without them being globally visible. Some may use singleton objects or variables that are part of a class rather than a single object (which is again a singleton.) So is the way R allows really a bad thing, especially if rarely used? 
All I know is MANY languages use scoping including functions that declare a variable then create an inner function or many and return the inner function(s) to the outside where the one getting it can later use that function and access the variable and even use it as a way to communicate with the multiple functions it got that were created in that incubator. Nifty stuff but arguably not always as easy to comprehend! This forum is not intended for BASHING any language, certainly not R. There are many other languages to choose from and every one of them will have some things others consider flaws. How many opted out of say a ++ operator as in y = x++ for fairly good reasons and later added something like the Walrus operator so you can now write y = (x := x + 1) as a way to do the same thing and other things besides? But to address your point, about a variable outside a function as defined in a set of environments to search that includes a global context, I want to note that it is just a set of maskings and your variable "x" can appear in EVERY environment above you and you can get in real trouble if the order the environments are put in place changes in some way. The arguably safer way would be to get a specific value of x would be to not ask for it directly but as get("x", envir=...) and specify the specific environment that ideally is in existence
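A tiny sketch of that last point (illustrative): assign() and get() let you name the environment explicitly instead of relying on whichever environment happens to be found first in the search order.
e <- new.env()
assign("x", 42, envir = e)       # an x that lives only in environment e
x <- 137                         # a different x in the global environment
get("x", envir = e)              # 42
get("x", envir = globalenv())    # 137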
Re: [R] return value of {....}
Richard, I sent my prior email too quickly: A slight addition to your code shows an important aspect of R, local vs. global variables: x <- 137 f <- function () { a <- x x <- 42 b <- x list(a=a, b=b) } f() print(x) When run the program produces the following: > x <- 137 > f <- function () { +a <- x +x <- 42 +b <- x +list(a=a, b=b) +} > f() $a [1] 137 $b [1] 42 > print(x) [1] 137 The fist x, a <- x, invokes an x variable that is GLOBAL. It is known both inside and outside the function. The second x, x <- 42, defines an x that is LOCAL to the function, it is not known to the program that called the function. The LOCAL value of x is used in the expression b <- x. As can be seen by the print(x) statement, the LOCAL value of x is NOT known by the program that calls the function. The class of a variable, scoping (i.e. local vs. variable) can be a source of subtle programming errors. A general recommendation is to AVOID use of a global variable in a function, i.e. don't use a variable in function that is not passed as a parameter to the function (as was done in the function above in the statment a <- x). If you need to use a variable in a function that is known by the program that calls the function, pass the variable as a argument to the function e.g. Use this code: # Set values needed by function y <- 2 b <- 30 myfunction <- function(a,b){ cat("a=",a,"b=",b,"\n") y <- a y2 <- y+b cat("y=",y,"y2=",y2,"\n") } # Call the function and pass all needed values to the function myfunction(y,b) Don't use the following code that depends on a global value that is known to the function, but not passed as a parameter to the function: y <- 2 myNGfunction <- function(a){ cat("a=",a,"b=",b,"\n") y <- a y2 <- y+b cat("y=",y,"y2=",y2,"\n") } # b is a global variable and will be know to the function, # but should be passed as a parameter as in example above. b <- 100 myNGfunction(y) John From: R-help on behalf of Sorkin, John Sent: Sunday, January 15, 2023 7:40 PM To: Richard O'Keefe; Valentin Petzel Cc: R help Mailing list Subject: Re: [R] return value of {} Richard, A slight addition to your code shows an important aspect of R, local vs. global variables: x <- 137 f <- function () { a <- x x <- 42 b <- x list(a=a, b=b) } f() print(x) From: R-help on behalf of Richard O'Keefe Sent: Sunday, January 15, 2023 6:39 PM To: Valentin Petzel Cc: R help Mailing list Subject: Re: [R] return value of {} I wonder if the real confusino is not R's scope rules? (begin .) is not Lisp, it's Scheme (a major Lisp dialect), and in Scheme, (begin (define x ...) (define y ...) ...) declares variables x and y that are local to the (begin ...) form, just like Algol 68. That's weirdness 1. Javascript had a similar weirdness, when the ECMAscript process eventually addressed. But the real weirdness in R is not just that the existence of variables is indifferent to the presence of curly braces, it's that it's *dynamic*. In f <- function (...) { ... use x ... x <- ... ... use x ... } the two occurrences of "use x" refer to DIFFERENT variables. The first occurrence refers to the x that exists outside the function. It has to: the local variable does not exist yet. The assignment *creates* the variable, so the second occurrence of "use x" refers to the inner variable. Here's an actual example. > x <- 137 > f <- function () { + a <- x + x <- 42 + b <- x + list(a=a, b=b) + } > f() $a [1] 137 $b [1] 42 Many years ago I set out to write a compiler for R, and this was the issue that finally sank my attempt. 
It's not whether the occurrence of "use x" is *lexically* before the creation of x. It's when the assignment is *executed* that makes the difference. Different paths of execution through a function may result in it arriving at its return point with different sets of local variables. R is the only language I routinely use that does this. So rule 1: whether an identifier in an R function refers to an outer variable or a local variable depends on whether an assignment creating that local variable has been executed yet. And rule 2: the scope of a local variable is the whole function. If the following transcript not only makes sense to you, but is exactly what you expect, congratulations, you understand local variables in R. > x <- 0 > g <- function () { + n <- 10 + r <- numeric(n) + for (i in 1:n) { + if (i == 6) x <- 100 + r[i] <- x + i + } + r + } > g() [1] 1 2 3 4 5 106 107 108 109 110
Re: [R] return value of {....}
Richard, A slight addition to your code shows an important aspect of R, local vs. global variables: x <- 137 f <- function () { a <- x x <- 42 b <- x list(a=a, b=b) } f() print(x) From: R-help on behalf of Richard O'Keefe Sent: Sunday, January 15, 2023 6:39 PM To: Valentin Petzel Cc: R help Mailing list Subject: Re: [R] return value of {} I wonder if the real confusion is not R's scope rules? (begin .) is not Lisp, it's Scheme (a major Lisp dialect), and in Scheme, (begin (define x ...) (define y ...) ...) declares variables x and y that are local to the (begin ...) form, just like Algol 68. That's weirdness 1. JavaScript had a similar weirdness, which the ECMAScript process eventually addressed. But the real weirdness in R is not just that the existence of variables is indifferent to the presence of curly braces, it's that it's *dynamic*. In f <- function (...) { ... use x ... x <- ... ... use x ... } the two occurrences of "use x" refer to DIFFERENT variables. The first occurrence refers to the x that exists outside the function. It has to: the local variable does not exist yet. The assignment *creates* the variable, so the second occurrence of "use x" refers to the inner variable. Here's an actual example. > x <- 137 > f <- function () { + a <- x + x <- 42 + b <- x + list(a=a, b=b) + } > f() $a [1] 137 $b [1] 42 Many years ago I set out to write a compiler for R, and this was the issue that finally sank my attempt. It's not whether the occurrence of "use x" is *lexically* before the creation of x. It's when the assignment is *executed* that makes the difference. Different paths of execution through a function may result in it arriving at its return point with different sets of local variables. R is the only language I routinely use that does this. So rule 1: whether an identifier in an R function refers to an outer variable or a local variable depends on whether an assignment creating that local variable has been executed yet. And rule 2: the scope of a local variable is the whole function. If the following transcript not only makes sense to you, but is exactly what you expect, congratulations, you understand local variables in R. > x <- 0 > g <- function () { + n <- 10 + r <- numeric(n) + for (i in 1:n) { + if (i == 6) x <- 100 + r[i] <- x + i + } + r + } > g() [1] 1 2 3 4 5 106 107 108 109 110 On Fri, 13 Jan 2023 at 23:28, Valentin Petzel wrote: > Hello Akshay, > > R is quite inspired by LISP, where this is a common thing. It is not in > fact that {...} returned something, rather any expression evaluates to > some value, and for a compound statement that is the last evaluated > expression. > > {...} might be seen as similar to LISP's (begin ...). > > Now this is a very different thing compared to {...} in something like C, > even if it looks or behaves similarly. But in R {...} is in fact an > expression and thus has to evaluate to some value. This also comes with some > nice benefits. > > You do not need to use {...} for anything that is a single statement. But > you can in each possible place use {...} to turn multiple statements into > one. > > Now think about a statement like this > > f <- function(n) { > x <- runif(n) > x**2 > } > > Then we can do > > y <- f(10) > > Now, your suggested way would look like this: > > f <- function(n) { > x <- runif(n) > y <- x**2 > } > > And we'd need to do something like: > > f(10) > y <- somehow_get_last_env_of_f$y > > So having a compound statement evaluate to a value clearly has a benefit.
> > Best Regards, > Valentin > > 09.01.2023 18:05:58 akshay kulkarni : > > > Dear Valentin, > > But why should {} "return" a value? It > could just as well evaluate all the expressions and store the resulting > objects in whatever environment the interpreter chooses, and then it would > be left to the user to manipulate any object he chooses. Don't you think > returning the last, or any value, is redundant? We are living in the > 21st century world, and the R-core team might, I suppose, have a definite > reason for "returning" the last value. Any comments? > > > > Thanking you, > > Yours sincerely, > > AKSHAY M KULKARNI > > > > > > *From:* Valentin Petzel > > *Sent:* Monday, January 9, 2023 9:18 PM > > *To:* akshay kulkarni > > *Cc:* R help Mailing list > > *Subject:* Re: [R] return value of {} > > > > Hello Akshay, > > > > I think you are confusing {...} with local({...}). This one will > evaluate the expression in a separate environment, returning the last > expression. > > > > {...} simply evaluates multiple expressions as one and returns the > result of the last line, but it still evaluates each expression. > > > > Assignment returns the assigned value, so we can chain assignments like > this > > > > a <- 1 + (b <- 2) > > > > convenien
Re: [R] Removing variables from data frame with a wild card
I am new to this thread. At the risk of presenting something that has been shown before, below I demonstrate how a column in a data frame can be dropped using a wild card, i.e. a column whose name starts with "th", using nothing more than base R functions and base R syntax. While additions to R such as the tidyverse can be very helpful, many things that they do can be accomplished simply using base R. # Create data frame with three columns one <- rep(1,10) one two <- rep(2,10) two three <- rep(3,10) three mydata <- data.frame(one=one, two=two, three=three) cat("Data frame with three columns\n") mydata # Drop the column whose name starts with th, i.e. column three # Find the location of the column ColumnToDelete <- grep("th", colnames(mydata)) cat("The column to be dropped is the column called three, which is column",ColumnToDelete,"\n") ColumnToDelete # Drop the column whose name starts with "th" newdata2 <- mydata[,-ColumnToDelete] cat("Data frame after dropping column whose name is three\n") newdata2 I hope this helps. John From: R-help on behalf of Valentin Petzel Sent: Saturday, January 14, 2023 1:21 PM To: avi.e.gr...@gmail.com Cc: 'R-help Mailing List' Subject: Re: [R] Removing variables from data frame with a wild card Hello Avi, while something like d$something <- ... may seem like you're directly modifying the data, it does not actually do so. Most R objects try to be immutable, that is, the object may not change after creation. This guarantees that if you have a binding for the same object the object won't change sneakily. There is a data structure that is in fact mutable, namely environments. For example compare L <- list() local({L$a <- 3}) L$a with E <- new.env() local({E$a <- 3}) E$a The latter will in fact work, as the same environment is modified, while in the first one a modified copy of the list is made. Under the hood we have a parser trick: If R sees something like f(a) <- ... it will look for a function f<- and call a <- f<-(a, ...) (this also happens for example when you do names(x) <- ...) So in fact in our case this is equivalent to creating a copy with removed columns and rebinding the symbol in the current environment to the result. The data.table package breaks with this convention and uses C based routines that allow changing of data without copying the object. Doing d[, (cols_to_remove) := NULL] will actually change the data. Regards, Valentin 14.01.2023 18:28:33 avi.e.gr...@gmail.com: > Steven, > > Just want to add a few things to what people wrote. > > In base R, the methods mentioned will let you make a copy of your original DF > that is missing the items you are selecting that match your pattern. > > That is fine. > > For some purposes, you want to keep the original data.frame and remove a > column within it. You can do that in several ways but the simplest is > something where you set the column to NULL as in: > > mydata$NAME <- NULL > > using the mydata["NAME"] notation can do that for you by using a loop or a > functional programming method that does that with all components of your grep. > > R does have optimizations that make this less useful as a partial copy of a > data.frame retains common parts till things change. > > For those who like to use the tidyverse, it comes with lots of tools that let > you select columns that start with or end with or contain some pattern and I > find that way easier.
> > > > -Original Message- > From: R-help On Behalf Of Steven Yen > Sent: Saturday, January 14, 2023 7:49 AM > To: Andrew Simmons > Cc: R-help Mailing List > Subject: Re: [R] Removing variables from data frame with a wild card > > Thanks to all. Very helpful. > > Steven from iPhone > >> On Jan 14, 2023, at 3:08 PM, Andrew Simmons wrote: >> >> You'll want to use grep() or grepl(). By default, grep() uses >> extended regular expressions to find matches, but you can also use >> perl regular expressions and globbing (after converting to a regular >> expression). >> For example: >> >> grepl("^yr", colnames(mydata)) >> >> will tell you which 'colnames' start with "yr". If you'd rather >> use globbing: >> >> grepl(glob2rx("yr*"), colnames(mydata)) >> >> Then you might write something like this to remove the columns starting with >> yr: >> >> mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE] >> >>> On Sat, Jan 14, 2023 at 1:56 AM Steven T. Yen wrote: >>> >>> I have a data frame containing variables &
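A small sketch of the mechanisms Valentin describes above, on invented objects (the replacement-function name second<-, the vector v, and the data.table columns are made up for illustration; the last part assumes the data.table package is installed):

# replacement function: f(a) <- value is rewritten by the parser as a <- `f<-`(a, value)
`second<-` <- function(x, value) {
  x[2] <- value
  x
}
v <- c(10, 20, 30)
second(v) <- 99                      # v is rebound to a modified copy, not changed in place
v                                    # 10 99 30

# lists are copied on modification, environments are not
L <- list(); local(L$a <- 3); L$a    # NULL: the copy made inside local() is discarded
E <- new.env(); local(E$a <- 3); E$a # 3: the same environment object was modified

# data.table removes columns by reference
library(data.table)
dt <- data.table(xx = 1:3, yr1 = rnorm(3), yr2 = rnorm(3))
cols_to_remove <- grep("^yr", names(dt), value = TRUE)
dt[, (cols_to_remove) := NULL]       # dt itself is changed, no copy is made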
Re: [R] Removing variables from data frame with a wild card
You rang sir? library(tidyverse) xx = 1:10 yr1 = yr2 = yr3 = rnorm(10) dat1 <- data.frame(xx, yr1, yr2, yr3) dat1 %>% select(!starts_with("yr")) or for something a bit more exotic, as I have been trying to learn a bit about the "data.table" package: library(data.table) xx = 1:10 yr1 = yr2 = yr3 = rnorm(10) dat2 <- data.table(xx, yr1, yr2, yr3) dat2[, !names(dat2) %like% "yr", with=FALSE ] On Sat, 14 Jan 2023 at 12:28, wrote: > Steven, > > Just want to add a few things to what people wrote. > > In base R, the methods mentioned will let you make a copy of your original > DF that is missing the items you are selecting that match your pattern. > > That is fine. > > For some purposes, you want to keep the original data.frame and remove a > column within it. You can do that in several ways but the simplest is > something where you set the column to NULL as in: > > mydata$NAME <- NULL > > using the mydata["NAME"] notation can do that for you by using a loop or a > functional programming method that does that with all components of your > grep. > > R does have optimizations that make this less useful as a partial copy of > a data.frame retains common parts till things change. > > For those who like to use the tidyverse, it comes with lots of tools that > let you select columns that start with or end with or contain some pattern > and I find that way easier. > > > > -Original Message- > From: R-help On Behalf Of Steven Yen > Sent: Saturday, January 14, 2023 7:49 AM > To: Andrew Simmons > Cc: R-help Mailing List > Subject: Re: [R] Removing variables from data frame with a wild card > > Thanks to all. Very helpful. > > Steven from iPhone > > > On Jan 14, 2023, at 3:08 PM, Andrew Simmons wrote: > > > > You'll want to use grep() or grepl(). By default, grep() uses > > extended regular expressions to find matches, but you can also use > > perl regular expressions and globbing (after converting to a regular > expression). > > For example: > > > > grepl("^yr", colnames(mydata)) > > > > will tell you which 'colnames' start with "yr". If you'd rather > > use globbing: > > > > grepl(glob2rx("yr*"), colnames(mydata)) > > > > Then you might write something like this to remove the columns starting > with yr: > > > > mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE] > > > >> On Sat, Jan 14, 2023 at 1:56 AM Steven T. Yen wrote: > >> > >> I have a data frame containing variables "yr3",...,"yr28". > >> > >> How do I remove them with a wild card, something similar to "del yr*" > >> in Windows/DOS? Thank you. > >> > >>> colnames(mydata) > >> [1] "year" "weight" "confeduc" "confothr" "college" > >> [6] ... > >> [41] "yr3" "yr4" "yr5" "yr6" "yr7" > >> [46] "yr8" "yr9" "yr10" "yr11" "yr12" > >> [51] "yr13" "yr14" "yr15" "yr16" "yr17" > >> [56] "yr18" "yr19" "yr20" "yr21" "yr22" > >> [61] "yr23" "yr24" "yr25" "yr26" "yr27" > >> [66] "yr28" ... > >> > >> __ > >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> PLEASE do read the posting guide > >> http://www.R-project.org/posting-guide.html > >> and provide commented, minimal, self-contained, reproducible code. > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
> > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Pipe operator
Jeff, Thank you for contributing important information to this thread. From: Jeff Newmiller Sent: Tuesday, January 3, 2023 2:07 PM To: r-help@r-project.org; Sorkin, John; Ebert,Timothy Aaron; 'R-help Mailing List' Subject: Re: [R] Pipe operator The other responses here have been very good, but I felt it necessary to point out that the concept of a pipe originated around when you started programming [1] (text based). It did take a while for it to migrate into programming languages such as OCaml, but PowerShell makes extensive use of (object-based) pipes. Re memory use: not so much. Variables are small... it is the data they point to that is large, and it is not possible to analyze data without storing it somewhere. But when the variables are numerous they can interfere with our ability to understand the program... using pipes lets us focus on results obtained after several steps so fewer intermediate values clutter the variable space. Re speed: the magrittr pipe (%>%) is much slower than the built-in pipe at coordinating the transfer of data from left to right, but that is not usually significant compared to the computation speed on the actual data in the functions. [1] https://en.m.wikipedia.org/wiki/Pipeline_(Unix) On January 3, 2023 9:13:22 AM PST, "Sorkin, John" wrote: >Tim, > >Thank you for your reply. I did not know about the |> operator. Do both %>% >and |> work in base R? > >You suggested that the pipe operator can produce code with fewer variables. >May I ask you to send a short example in which the pipe operator saves >variables. Does said saving of variables speed up processing or result in less >memory usage? > >Thank you, >John > >____ >From: Ebert,Timothy Aaron >Sent: Tuesday, January 3, 2023 12:07 PM >To: Sorkin, John; 'R-help Mailing List' >Subject: RE: Pipe operator > >The pipe shortens code and results in fewer variables because you do not have >to save intermediate steps. Once you get used to the idea it is useful. Note >that there is also the |> pipe that is part of base R. As far as I know it >does the same thing as %>%, or at my level of programming I have not >encountered a difference. > >Tim > >-Original Message- >From: R-help On Behalf Of Sorkin, John >Sent: Tuesday, January 3, 2023 11:49 AM >To: 'R-help Mailing List' >Subject: [R] Pipe operator > >[External Email] > >I am trying to understand the reason for existence of the pipe operator, %>%, >and when one should use it. It is my understanding that the operator sends the >file to the left of the operator to the function immediately to the right of >the operator: > >c(1:10) %>% mean results in a value of 5.5 which is exactly the same as the >result one obtains using the mean function directly, viz. mean(c(1:10)). What >is the reason for having two syntactically different but semantically >identical ways to call a function? Is one more efficient than the other? Does >one use less memory than the other? > >P.S. Please forgive what might seem to be a question with an obvious answer. I >am a programmer dinosaur. I have been programming for more than 50 years.
>When I started programming in the 1960s the only pipe one spoke about was a bong. > >John > >__ >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code.
Re: [R] Pipe operator
Tim, Thank you for your reply. I did not know about the |> operator. Do both %>% and |> work in base R? You suggested that the pipe operator can produce code with fewer variables. May I ask you to send a short example in which the pipe operator saves variables. Does said saving of variables speed up processing or result in less memory usage? Thank you, John From: Ebert,Timothy Aaron Sent: Tuesday, January 3, 2023 12:07 PM To: Sorkin, John; 'R-help Mailing List' Subject: RE: Pipe operator The pipe shortens code and results in fewer variables because you do not have to save intermediate steps. Once you get used to the idea it is useful. Note that there is also the |> pipe that is part of base R. As far as I know it does the same thing as %>%, or at my level of programming I have not encountered a difference. Tim -Original Message- From: R-help On Behalf Of Sorkin, John Sent: Tuesday, January 3, 2023 11:49 AM To: 'R-help Mailing List' Subject: [R] Pipe operator [External Email] I am trying to understand the reason for existence of the pipe operator, %>%, and when one should use it. It is my understanding that the operator sends the file to the left of the operator to the function immediately to the right of the operator: c(1:10) %>% mean results in a value of 5.5 which is exactly the same as the result one obtains using the mean function directly, viz. mean(c(1:10)). What is the reason for having two syntactically different but semantically identical ways to call a function? Is one more efficient than the other? Does one use less memory than the other? P.S. Please forgive what might seem to be a question with an obvious answer. I am a programmer dinosaur. I have been programming for more than 50 years. When I started programming in the 1960s the only pipe one spoke about was a bong. John __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
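Since a short example was requested above, here is a minimal sketch using dplyr (dplyr is assumed only for convenience; the same point holds for any functions whose first argument is the data). The gain is tidiness and readability, not speed or memory: the piped versions give the same result without leaving tmp1/tmp2/tmp3 behind in the workspace.

library(dplyr)

# step-by-step version: intermediate objects pile up
tmp1 <- filter(mtcars, cyl == 4)
tmp2 <- group_by(tmp1, gear)
tmp3 <- summarise(tmp2, mean_mpg = mean(mpg))
tmp3

# magrittr pipe (%>%), re-exported by dplyr: no intermediate names needed
mtcars %>%
  filter(cyl == 4) %>%
  group_by(gear) %>%
  summarise(mean_mpg = mean(mpg))

# native pipe (|>), part of base R from 4.1.0 on
mtcars |>
  filter(cyl == 4) |>
  group_by(gear) |>
  summarise(mean_mpg = mean(mpg))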
[R] Pipe operator
I am trying to understand the reason for existence of the pipe operator, %>%, and when one should use it. It is my understanding that the operator sends the file to the left of the operator to the function immediately to the right of the operator: c(1:10) %>% mean results in a value of 5.5 which is exactly the same as the result one obtains using the mean function directly, viz. mean(c(1:10)). What is the reason for having two syntactically different but semantically identical ways to call a function? Is one more efficient than the other? Does one use less memory than the other? P.S. Please forgive what might seem to be a question with an obvious answer. I am a programmer dinosaur. I have been programming for more than 50 years. When I started programming in the 1960s the only pipe one spoke about was a bong. John __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R Certification
Hi Mukesh, Have a look at the blurb that prints at the start-up of R. "R is free software and comes with ABSOLUTELY NO WARRANTY." This is a hint that the R-Project is unlikely to be issuing certificates. On Mon, 2 Jan 2023 at 08:18, Mukesh Ghanshyamdas Lekhrajani via R-help < r-help@r-project.org> wrote: > Thanks Petr ! > > I will look at other training bodies as Coursera, or few others... but I > was just wondering if there could be a certificate from the "originators" > itself, I mean an "R" certificate from "r-project" itself and that would > carry more importance than external / unauthorized certificate bodies. > > But, if you suggest there is no such certification provided by > "r-project", then the only option for me is to search else where like - > Coursera or few others. > > I now have got my answers, but later the day - if ever "r-project" comes > up with "R Language" certifications, do keep me informed. > > > Thanks, Mukesh > 9819285174. > > > -Original Message- > From: PIKAL Petr > Sent: Monday, January 2, 2023 6:13 PM > To: mukesh.lekhraj...@yahoo.com; R-help Mailing List > > Subject: RE: [R] R Certification > > Hallo Mukesh > > R project is not Microsoft or Oracle AFAIK. But if you need some > certificate you could take courses on Coursera, they are offering > certificates. > > Cheers > Petr > > > -Original Message- > > From: R-help On Behalf Of Mukesh > > Ghanshyamdas Lekhrajani via R-help > > Sent: Monday, January 2, 2023 1:04 PM > > To: 'Jeff Newmiller' ; 'Mukesh Ghanshyamdas > > Lekhrajani via R-help' ; r-help@r-project.org > > Subject: Re: [R] R Certification > > > > Hello Jeff ! > > > > Yes, you are right.. and that’s why I am asking this question - just > like other > > governing bodies that issue certification on their respective > technologies, does > > "r-project.org" also have a learning path ? and then a certification. > > > > Say - Microsoft issues certificate for C#, .Net, etc.. > > Then, Oracle issues certificates for Java, DB etc.. > > > > These are authentic governing bodies for learning and issuing > certificates > > > > On exactly similar lines - "r-project.org" would also be having some > learning > > path and then let "r-project" take the proctored exam and issue a > certificate... > > > > I am not looking at any external institute for certifying me on "R" - > but, the > > governing body itself.. > > > > So, the question again is - "does r-project provide a learning path and > issue > > certificate after taking exams" > > > > Thanks, Mukesh > > 9819285174 > > > > > > > > -Original Message- > > From: Jeff Newmiller > > Sent: Monday, January 2, 2023 2:26 PM > > To: mukesh.lekhraj...@yahoo.com; Mukesh Ghanshyamdas Lekhrajani via R- > > help ; r-help@r-project.org > > Subject: Re: [R] R Certification > > > > I think this request is like saying "I want a unicorn." There are many > > organizations that will enter your name into a certificate form for a > fee, possibly > > with some credibility... but if they put "r-project.org" down as the > name of the > > organization granting this "certificate" then you are probably getting > fooled. > > > > On December 30, 2022 8:33:09 AM PST, Mukesh Ghanshyamdas Lekhrajani via > > R-help wrote: > > >Hello R Support Team, > > > > > > > > > > > >I want to do R certification, could you help me with the list of > > >certificates with their prices so it helps me to register. > > > > > > > > > > > >I want to do the certification directly from the governing body > > >"r-project.org" and not from any 3rd party. 
> > > > > > > > > > > >Please help. > > > > > > > > > > > > > > > > > > > > > > > >Mukesh > > > > > >+91 9819285174 > > > > > > > > > [[alternative HTML version deleted]] > > > > > >__ > > >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > > >https://stat.ethz.ch/mailman/listinfo/r-help > > >PLEASE do read the posting guide > > >http://www.R-project.org/posting-guide.html > > >and provide commented, minimal, self-contained, reproduc
Re: [R] Reg: Help in assigning colors to factor variable in ggplot2
Here is a rough guess at what you may want with a bit of mock data and using ggplot2. ##=# library(ggplot2) library(RColorBrewer) dat1 <- data.frame(aa = sample(1:10, 20, replace = TRUE), bb = sample(21:30, 20, replace = TRUE), outcome = sample(c("died", "home", "other hospital","secondary care/rehab"), 20, replace = TRUE )) ggplot(dat1, aes(aa, bb, colour = outcome)) + geom_point() + scale_colour_brewer(palette = "Dark2") + labs( x = "Maximum body temperature", y = "Maximum heart rate", colour = "outcome", title ="500 ICU patients" ) #### On Mon, 26 Dec 2022 at 09:46, John Kane wrote: > I suspect you may be mixing *plot()* commands with *ggplot()* commands and > they are likely incompatible. > > Could you supply some sample data and any error messages that you are > getting? A handy way to supply some sample data is the dput() function. > In the case of a large dataset something like dput(head(mydata, 100)) > should supply the data we need. > > On Mon, 26 Dec 2022 at 09:06, Upananda Pani > wrote: > >> Dear All, >> >> I am trying to plot a scatter plot between temperature and heart rate and >> additionally marking the outcome of the patients by colors. I am using the >> standard function plot as well as the functions of >> package "ggplot2" (Wickham (2009)) and saving the plots in pdf files. >> >> I am getting an error when assigning colsOutcome to the >> scatterplot. I am doing it wrongly. Please advise me. >> ```{r} >> library(ggplot2) >> library(RColorBrewer) >> library(ggsci) >> ICUData <- read.csv(file = "ICUData.csv") >> ``` >> ```{r} >> ## Generate empty vector >> colsOutcome <- character(nrow(ICUData)) >> ## Fill with colors >> colsOutcome[ICUData$outcome == "died"] <- "#E41A1C" >> colsOutcome[ICUData$outcome == "home"] <- "#377EB8" >> colsOutcome[ICUData$outcome == "other hospital"] <- "#4DAF4A8" >> colsOutcome[ICUData$outcome == "secondary care/rehab"] <- "#984EA3" >> ``` >> >> ```{r} >> plot(x = ICUData$temperature, y = ICUData$heart.rate, pch = 19, >> xlab = "Maximum body temperature", ylab = "Maximum heart rate", >> main = "500 ICU patients", col = colsOutcome, xlim = c(33,43)) >> legend(x = "topleft", legend = c("died", "home", "other hospital", >> "secondary care/rehab"), pch = 19, >> col = c("#E41A1C", "#377EB8", "#4DAF4A8", "#984EA3")) >> ``` >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > > -- > John Kane > Kingston ON Canada > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reg: Help in assigning colors to factor variable in ggplot2
I suspect you may be mixing *plot()* commands with *ggplot()* commands and they are likely incompatible. Could you supply some sample data and any error messages that you are getting? A handy way to supply some sample data is the dput() function. In the case of a large dataset something like dput(head(mydata, 100)) should supply the data we need. On Mon, 26 Dec 2022 at 09:06, Upananda Pani wrote: > Dear All, > > I am trying to plot a scatter plot between temperature and heart rate and > additionally marking the outcome of the patients by colors. I am using the > standard function plot as well as the functions of > package "ggplot2" (Wickham (2009)) and saving the plots in pdf files. > > I am getting an error when assigning colsOutcome to the > scatterplot. I am doing it wrongly. Please advise me. > ```{r} > library(ggplot2) > library(RColorBrewer) > library(ggsci) > ICUData <- read.csv(file = "ICUData.csv") > ``` > ```{r} > ## Generate empty vector > colsOutcome <- character(nrow(ICUData)) > ## Fill with colors > colsOutcome[ICUData$outcome == "died"] <- "#E41A1C" > colsOutcome[ICUData$outcome == "home"] <- "#377EB8" > colsOutcome[ICUData$outcome == "other hospital"] <- "#4DAF4A8" > colsOutcome[ICUData$outcome == "secondary care/rehab"] <- "#984EA3" > ``` > > ```{r} > plot(x = ICUData$temperature, y = ICUData$heart.rate, pch = 19, > xlab = "Maximum body temperature", ylab = "Maximum heart rate", > main = "500 ICU patients", col = colsOutcome, xlim = c(33,43)) > legend(x = "topleft", legend = c("died", "home", "other hospital", > "secondary care/rehab"), pch = 19, > col = c("#E41A1C", "#377EB8", "#4DAF4A8", "#984EA3")) > ``` > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Amazing AI
Does not Medians <- apply(numeric_data, 1, median) give us the row medians? On Mon, 19 Dec 2022 at 05:52, Milan Glacier wrote: > On 12/18/22 19:01, Boris Steipe wrote: > >Technically not a help question. But crucial to be aware of, especially > for those of us in academia, or otherwise teaching R. I am not aware of a > suitable alternate forum. If this does not interest you, please simply > ignore - I already know that this may be somewhat OT. > > > >Thanks. > >-- > > > >You very likely have heard of ChatGPT, the conversation interface on top > of the GPT-3 large language model and that it can generate code. I thought > it doesn't do R - I was wrong. Here is a little experiment: > >Note that the strategy is quite different (e.g. using %in%, not > duplicated() ), the interpretation of "last variable" is technically > correct but not what I had in mind (ChatGPT got that right though). > > > > > >Changing my prompts slightly resulted in it going for a dplyr solution > instead, complete with %>% idioms etc ... again, syntactically correct but > not giving me the fully correct results. > > > >-- > > > >Bottom line: The AI's ability to translate natural language instructions > into code is astounding. Errors the AI makes are subtle and probably not > easy to fix if you don't already know what you are doing. But the way that > this can be "confidently incorrect" and plausible makes it nearly > impossible to detect unless you actually run the code (you may have noticed > that when you read the code). > > > >Will our students use it? Absolutely. > > > >Will they successfully cheat with it? That depends on the assignment. We > probably need to _encourage_ them to use it rather than sanction - but > require them to attribute the AI, document prompts, and identify their own, > additional contributions. > > > >Will it help them learn? When you are aware of the issues, it may be > quite useful. It may be especially useful to teach them to specify their > code carefully and completely, and to ask questions in the right way. Test > cases are crucial. > > > >How will it affect what we do as instructors? I don't know. Really. > > > >And the future? I am not pleased to extrapolate to a job market in which > they compete with knowledge workers who work 24/7 without benefits, > vacation pay, or even a salary. They'll need to rethink the value of their > investment in an academic education. We'll need to rethink what we do to > provide value above and beyond what AI's can do. (Nb. all of the arguments > I hear about why humans will always be better etc. are easily debunked, but > that's even more OT :-) > > > > > > > >If you have thoughts to share how your institution is thinking about > academic integrity in this situation, or creative ideas how to integrate > this into teaching, I'd love to hear from you. > > *NEVER* let the AI mislead the students! ChatGPT gives you seemingly > sound but actually *wrong* code! > > ChatGPT never understands the formal abstraction behind the code, it > just understands the shallow text pattern (and the syntax rules) in the > code. And it often gives you code that seems correct but in fact produces > wrong output. If it is used with code completion, then it is okay > (just like GitHub Copilot), since the coder needs to modify the code > after getting the completion. But if you want to use ChatGPT for > students to query information or write code, it is error prone!
> > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
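For what it's worth, yes: with MARGIN = 1, apply() works row-wise, so the call above returns one median per row. A quick check on invented data (numeric_data below is mock data, not the data from the thread):

set.seed(1)
numeric_data <- matrix(rnorm(12), nrow = 3)
Medians <- apply(numeric_data, 1, median)   # MARGIN = 1: apply median() to each row
Medians
Medians[1] == median(numeric_data[1, ])     # TRUE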
[R] Plot a line using ggplot2
Colleagues, I am trying to plot a simple line using ggplot2. I get the axes, but I don't get the line. Please let me know what error I am making. Thank you, John # Define x and y values PointEstx <- Estx+1.96*SE PointEsty <- 1 row2 <- cbind(PointEstx,PointEsty) linedata<- data_frame(rbind(row1,row2)) linedata # make sure we have a data frame class(linedata) #plot the data ggplot(linedata,aes(x=PointEstx, y=PointEsty), geom_line()) __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
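The question is left as posted above; for reference, here is a minimal sketch of one way the plot can be made to work, with made-up values standing in for Estx, SE, and row1, which are not shown in the post. The key change is that geom_line() has to be added to the plot with +, not passed as an argument to ggplot(), and the data frame needs two points for a line to be drawn.

library(ggplot2)

Estx <- 5      # hypothetical point estimate
SE   <- 0.5    # hypothetical standard error

# two end points of the line segment (lower and upper limits, say)
linedata <- data.frame(
  PointEstx = c(Estx - 1.96 * SE, Estx + 1.96 * SE),
  PointEsty = c(1, 1)
)

ggplot(linedata, aes(x = PointEstx, y = PointEsty)) +
  geom_line()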
Re: [R] rio: list of extensions for supported formats
Cat was being helpful. On Sat, 5 Nov 2022 at 15:39, John Kane wrote: > o idea but there is a list here > https://thomasleeper.com/rio/articles/rio.html > > On Sat, 5 Nov 2022 at 04:04, Sigbert Klinke > wrote: > >> Hi, >> >> is there a function in the package rio to get the file extensions listed >> in the vignette under supported formats? >> >> Thanks Sigbert >> >> __ >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > > -- > John Kane > Kingston ON Canada > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rio: list of extensions for supported formats
o idea but there is a list here https://thomasleeper.com/rio/articles/rio.html On Sat, 5 Nov 2022 at 04:04, Sigbert Klinke wrote: > Hi, > > is there a function in the package rio to get the file extensions listed > in the vignette under supported formats? > > Thanks Sigbert > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Single pdf of all R vignettes request
Dear All, I am writing to ask whether there exists a single pdf of all the vignettes from R packages. This would be a good resource. Best regards, John __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] startup loading issue
On Tue, 25 Oct 2022 08:33:10 -0500 ken eagle wrote: > I thought I was loading a ~300M binary (bigwig) file into another > application . . . Is the other application R dependent, written in R, or does it call R capabilities? If it doesn't, the issue might be with the "other application" rather than with R. JWDougherty __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] unexpected 'else' in " else"
Dear Jinsong, When you enter these code lines at the R command prompt, the interpreter evaluates an expression when it's syntactically complete, which occurs before it sees the else clause. The interpreter can't read your mind and know that an else clause will be entered on the next line. When the code lines are in a function, the function body is enclosed in braces and so the interpreter sees the else clause. As I believe was already pointed out, you can similarly use braces at the command prompt to signal incompleteness of an expression, as in > {if (FALSE) print(1) + else print(2)} [1] 2 I hope this helps, John -- John Fox, Professor Emeritus McMaster University Hamilton, Ontario, Canada web: https://socialsciences.mcmaster.ca/jfox/ On 2022-10-21 8:06 a.m., Jinsong Zhao wrote: Thanks a lot! I know the first and third way to correct the error. The second way seems make me know why the code is correct in the function stats::weighted.residuals. On 2022/10/21 17:36, Andrew Simmons wrote: The error comes from the expression not being wrapped with braces. You could change it to if (is.matrix(r)) { r[w != 0, , drop = FALSE] } else r[w != 0] or { if (is.matrix(r)) r[w != 0, , drop = FALSE] else r[w != 0] } or if (is.matrix(r)) r[w != 0, , drop = FALSE] else r[w != 0] On Fri., Oct. 21, 2022, 05:29 Jinsong Zhao, wrote: Hi there, The following code would cause R error: > w <- 1:5 > r <- 1:5 > if (is.matrix(r)) + r[w != 0, , drop = FALSE] > else r[w != 0] Error: unexpected 'else' in " else" However, the code: if (is.matrix(r)) r[w != 0, , drop = FALSE] else r[w != 0] is extracted from stats::weighted.residuals. My question is why the code in the function does not cause error? Best, Jinsong __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html <http://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] prcomp - arbitrary direction of the returned principal components
This reminds me of a situation in 1975 where a large computer service bureau had contracted to migrate scientific software from a Univac 1108 to an IBM System 360. They spent 3 weeks trying to get the IBM to give the same eigenvectors on a problem as the Univac. There were at least 2 eigenvalues that were equal. They were trying to fix something that was not broken. Their desperation was enough to offer me a very large fee to "fix" things. However, I had a nice job, so I told them to go away and read a couple of books on the real-symmetric eigenvalue problem or singular value decomposition, though the latter was just becoming known outside of numerical linear algebra. I suspect the OP should go back to basics with principal components and not try to fiddle with the output. It is likely that the "loadings" (I'm never sure of the nomenclature -- I use the matrix setup) can be rotated, but you can't just rotate one vector of a set on its own. Amazing how these old issues linger for decades. Or maybe linear algebra is not on the curriculum. John Nash On Thu, 2022-10-13 at 19:35 +0530, Ashim Kapoor wrote: > Dear All, > > Many thanks for your replies. > > My PC1 loading turns out to be : > > 1/sqrt(2) , -1/sqrt(2) > > In simple words : I had 2 variables and I ran prcomp on them. I got my > PC1 as : .7071068 var1 - .7071068 var2 > > PC2 turned out to be the same as PC1 with a PLUS replacing the minus, > i.e. .7071068 var1 + .7071068 var2 > > But forget PC2 for the time being. > > Now my question is : I am not able to use the rule that : choose the > variable with a bigger magnitude of loading and multiply PC1 by -1 if > needed (to flip the PC1 since any vector x and its flipped version -x > are the same vector but with opposite direction) if the variable with > bigger magnitude is of negative sign. > > I have an alternative measure of stress which is trending UP and has 2 > peaks during 2 recessions and I can see that PC1 is trending DOWN and > has 2 TROUGHS during the same recessions. That's how I wish to FLIP > PC1 with a negative sign. > > The data is not mine and I am not at liberty to share it. I can > construct an artificial example but I would need time to do that. > > That's what's happening. > > Best Regards and > Many thanks. > Ashim
> > Add an if test to selectively rotate based on the value of a single test > > element in x > > (as in x[3,2]). > > > > In debugging or trouble shooting setting seed is useful. For actual data > > analysis you > > should not set seed, or possibly better yet use set.seed(NULL). > > > > Tim > > > > > > > > -Original Message- > > From: Ashim Kapoor > > Sent: Thursday, October 13, 2022 12:28 AM > > To: Ebert,Timothy Aaron > > Cc: R Help > > Subject: Re: [R] prcomp - arbitrary direction of the returned principal > > components > > > > [External Email] > > > > Dear Aaron, > > > > Many thanks for your reply. > > > > Please allow me to illustrate my query a bit. > > > > I take some data, throw it to prcomp and extract the x data frame from > > prcomp. > > &
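A small sketch of the sign-flipping convention discussed in this thread, on invented data (the variable names and the choice of var1 as the reference variable are assumptions for illustration; prcomp itself attaches no meaning to the sign of a component):

set.seed(42)
var1 <- rnorm(100)
var2 <- -var1 + rnorm(100, sd = 0.3)       # roughly the mirror image of var1
dat  <- data.frame(var1, var2)

pca <- prcomp(dat, scale. = TRUE)
pca$rotation[, 1]                          # loadings near +/-0.707; the sign is arbitrary

# convention: flip PC1 so that the chosen reference variable loads positively on it
if (pca$rotation["var1", "PC1"] < 0) {
  pca$rotation[, "PC1"] <- -pca$rotation[, "PC1"]
  pca$x[, "PC1"]        <- -pca$x[, "PC1"] # flip the scores together with the loadings
}
pca$rotation[, 1]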
Re: [R] How long does it take to learn the R programming language?
+ 1 On Wed, 28 Sept 2022 at 17:36, Jim Lemon wrote: > Given some of the questions that are posted to this list, I am not > sure that there is an upper bound to the estimate. > > Jim > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- John Kane Kingston ON Canada [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.