Re: [R] substitute column data frame based on name stored in variable in r
Got it, thank you!

On Tue, 10 Aug 2021, 00:12 David Winsemius wrote:
>
> On 8/9/21 12:22 PM, Luigi Marongiu wrote:
> > Thank you! it worked fine! The only pitfall is that `NA` became
> > `<NA>`. This is essentially the same thing anyway...
>
> It's not "essentially the same thing". It IS the same thing. The print
> function displays those '<>' characters flanking NA's when the class is
> factor. Type this at your console:
>
> factor(NA)
>
> --
> David
>
> > On Mon, Aug 9, 2021 at 5:18 PM Ivan Krylov wrote:
> >> Thanks for providing a reproducible example!
> >>
> >> On Mon, 9 Aug 2021 15:33:53 +0200
> >> Luigi Marongiu wrote:
> >>
> >>> df[df[['vect[2]']] == 2, 'vect[2]'] <- "No"
> >> Please don't quote R expressions that you want to evaluate. 'vect[2]'
> >> is just a string, like 'hello world' or 'I want to create a new column
> >> named "vect[2]" instead of accessing the second one'.
> >>
> >>> Error in `[<-.data.frame`(`*tmp*`, df[[vect[2]]] == 2, vect[2], value
> >>> = "No") : missing values are not allowed in subscripted assignments
> >>> of data frames
> >> Since df[[2]] contains NAs, comparisons with it also contain NAs. While
> >> it's possible to subset data.frames with NAs (the rows corresponding to
> >> the NAs are returned filled with NAs of corresponding types),
> >> assignment to undefined rows is not allowed. A simple way to remove the
> >> NAs and only leave the cases where df[[vect[2]]] == 2 is TRUE would be
> >> to use which(). Compare:
> >>
> >> df[df[[vect[2]]] == 2,]
> >> df[which(df[[vect[2]]] == 2),]
> >>
> >> --
> >> Best regards,
> >> Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] substitute column data frame based on name stored in variable in r
On 8/9/21 12:22 PM, Luigi Marongiu wrote:
> Thank you! it worked fine! The only pitfall is that `NA` became
> `<NA>`. This is essentially the same thing anyway...

It's not "essentially the same thing". It IS the same thing. The print function displays those '<>' characters flanking NA's when the class is factor. Type this at your console:

factor(NA)

--
David

> On Mon, Aug 9, 2021 at 5:18 PM Ivan Krylov wrote:
>> Thanks for providing a reproducible example!
>>
>> On Mon, 9 Aug 2021 15:33:53 +0200
>> Luigi Marongiu wrote:
>>
>>> df[df[['vect[2]']] == 2, 'vect[2]'] <- "No"
>> Please don't quote R expressions that you want to evaluate. 'vect[2]'
>> is just a string, like 'hello world' or 'I want to create a new column
>> named "vect[2]" instead of accessing the second one'.
>>
>>> Error in `[<-.data.frame`(`*tmp*`, df[[vect[2]]] == 2, vect[2], value
>>> = "No") : missing values are not allowed in subscripted assignments
>>> of data frames
>> Since df[[2]] contains NAs, comparisons with it also contain NAs. While
>> it's possible to subset data.frames with NAs (the rows corresponding to
>> the NAs are returned filled with NAs of corresponding types),
>> assignment to undefined rows is not allowed. A simple way to remove the
>> NAs and only leave the cases where df[[vect[2]]] == 2 is TRUE would be
>> to use which(). Compare:
>>
>> df[df[[vect[2]]] == 2,]
>> df[which(df[[vect[2]]] == 2),]
>>
>> --
>> Best regards,
>> Ivan
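A minimal sketch of the point about `<NA>`, assuming a toy vector `x` that reuses the values of column B from the thread (the names `x` and `f` are illustrative only). It shows that the angle brackets are a display choice of the print method and that the underlying values are ordinary missing values. As far as I know, the same display is also used for character data printed without quotes, which is what happens to character columns inside a printed data frame.

```r
# Toy data (assumed for illustration): same values as column B in the thread
x <- c(1, 2, NA, 2, NA)
f <- factor(x)

print(f)      # the missing entries are displayed as <NA>
is.na(f)      # but they are ordinary missing values

# The missingness pattern is the same before and after conversion to factor
stopifnot(identical(is.na(f), is.na(x)))

# Character data printed without quotes uses the same <NA> display
print(c("No", NA), quote = FALSE)
```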
Re: [R] substitute column data frame based on name stored in variable in r
Thank you! It worked fine! The only pitfall is that `NA` became `<NA>`. This is essentially the same thing anyway...

On Mon, Aug 9, 2021 at 5:18 PM Ivan Krylov wrote:
>
> Thanks for providing a reproducible example!
>
> On Mon, 9 Aug 2021 15:33:53 +0200
> Luigi Marongiu wrote:
>
> > df[df[['vect[2]']] == 2, 'vect[2]'] <- "No"
>
> Please don't quote R expressions that you want to evaluate. 'vect[2]'
> is just a string, like 'hello world' or 'I want to create a new column
> named "vect[2]" instead of accessing the second one'.
>
> > Error in `[<-.data.frame`(`*tmp*`, df[[vect[2]]] == 2, vect[2], value
> > = "No") : missing values are not allowed in subscripted assignments
> > of data frames
>
> Since df[[2]] contains NAs, comparisons with it also contain NAs. While
> it's possible to subset data.frames with NAs (the rows corresponding to
> the NAs are returned filled with NAs of corresponding types),
> assignment to undefined rows is not allowed. A simple way to remove the
> NAs and only leave the cases where df[[vect[2]]] == 2 is TRUE would be
> to use which(). Compare:
>
> df[df[[vect[2]]] == 2,]
> df[which(df[[vect[2]]] == 2),]
>
> --
> Best regards,
> Ivan

--
Best regards,
Luigi
Re: [R] substitute column data frame based on name stored in variable in r
Thanks for providing a reproducible example!

On Mon, 9 Aug 2021 15:33:53 +0200
Luigi Marongiu wrote:

> df[df[['vect[2]']] == 2, 'vect[2]'] <- "No"

Please don't quote R expressions that you want to evaluate. 'vect[2]' is just a string, like 'hello world' or 'I want to create a new column named "vect[2]" instead of accessing the second one'.

> Error in `[<-.data.frame`(`*tmp*`, df[[vect[2]]] == 2, vect[2], value
> = "No") : missing values are not allowed in subscripted assignments
> of data frames

Since df[[2]] contains NAs, comparisons with it also contain NAs. While it's possible to subset data.frames with NAs (the rows corresponding to the NAs are returned filled with NAs of corresponding types), assignment to undefined rows is not allowed. A simple way to remove the NAs and only leave the cases where df[[vect[2]]] == 2 is TRUE would be to use which(). Compare:

df[df[[vect[2]]] == 2,]
df[which(df[[vect[2]]] == 2),]

--
Best regards,
Ivan
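The difference between the two subsetting calls can be sketched as follows, assuming a cut-down version of the example data frame from the thread (columns A and B only) and `vect` holding its column names. The intermediate name `cond` is introduced here only for illustration.

```r
# Cut-down version of the thread's example data (assumption: columns A and B only)
df <- data.frame(A = 1:5, B = c(1, 2, NA, 2, NA))
vect <- names(df)

# The comparison inherits the NAs of column B
cond <- df[[vect[2]]] == 2

# Logical subsetting keeps one all-NA row per NA in the condition
nrow(df[cond, ])

# which() returns only the positions where the condition is TRUE
nrow(df[which(cond), ])

# With which(), the assignment no longer involves missing subscripts
df[which(cond), vect[2]] <- "No"
df
```

Note that assigning the string "No" turns the numeric column into a character column, which is why the remaining missing values are then printed as `<NA>`.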
Re: [R] substitute column data frame based on name stored in variable in r
You are right, vect will contain the names of the columns of the real dataframe, but the actual simulation of the real case is more like this:
```
> df = data.frame(A = 1:5, B = c(1, 2, NA, 2, NA), C = c("value is blue",
    "Value is red", "empty", " value is blue", " Value is green"), D = 9:13, E =
    c("light", "light", "heavy", "heavy", "heavy")); df
  A  B               C  D     E
1 1  1   value is blue  9 light
2 2  2    Value is red 10 light
3 3 NA           empty 11 heavy
4 4  2   value is blue 12 heavy
5 5 NA  Value is green 13 heavy
> vect = LETTERS[1:5]
> df[df[['vect[2]']] == 2, 'vect[2]'] <- "No"; df
  A  B               C  D     E vect[2]
1 1  1   value is blue  9 light    <NA>
2 2  2    Value is red 10 light    <NA>
3 3 NA           empty 11 heavy    <NA>
4 4  2   value is blue 12 heavy    <NA>
5 5 NA  Value is green 13 heavy    <NA>
> df[df[[vect[2]]] == 2, vect[2]] <- "No"; df
Error in `[<-.data.frame`(`*tmp*`, df[[vect[2]]] == 2, vect[2], value = "No") :
  missing values are not allowed in subscripted assignments of data frames
```
But still, with the quotation marks I get an extra column instead of working on column B directly, and without them I get the error above.

On Mon, Aug 9, 2021 at 1:31 PM Ivan Krylov wrote:
>
> On Mon, 9 Aug 2021 13:16:02 +0200
> Luigi Marongiu wrote:
>
> > df = data.frame(VAR = ..., VAL = ...)
> > vect = letters[1:5]
>
> What is the relation between vect and the column names of the data
> frame? Is it your intention to choose rows or columns using `vect`?
>
> > df[df[['vect[2]']] == 2, 'vect[2]']
>
> '...' creates a string literal. If you want to evaluate an R
> expression, don't wrap it in quotes.
>
> I had assumed you wanted to put column names in the vector `vect`, but
> now I'm just confused: `vect` is the same as df$VAR, not colnames(df).
> What do you want to achieve?
>
> Again, you can access the second column with much less typing by
> addressing it directly: df[[2]]
>
> Does it help if you consult [**] or some other tutorial on subsetting
> in R?
>
> --
> Best regards,
> Ivan
>
> [**]
> https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Index-vectors
> https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Lists

--
Best regards,
Luigi
Re: [R] substitute column data frame based on name stored in variable in r
On Mon, 9 Aug 2021 13:16:02 +0200
Luigi Marongiu wrote:

> df = data.frame(VAR = ..., VAL = ...)
> vect = letters[1:5]

What is the relation between vect and the column names of the data frame? Is it your intention to choose rows or columns using `vect`?

> df[df[['vect[2]']] == 2, 'vect[2]']

'...' creates a string literal. If you want to evaluate an R expression, don't wrap it in quotes.

I had assumed you wanted to put column names in the vector `vect`, but now I'm just confused: `vect` is the same as df$VAR, not colnames(df). What do you want to achieve?

Again, you can access the second column with much less typing by addressing it directly: df[[2]]

Does it help if you consult [**] or some other tutorial on subsetting in R?

--
Best regards,
Ivan

[**]
https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Index-vectors
https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Lists
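The distinction between a quoted string and an evaluated expression can be sketched like this. The data frame matches the one posted in the thread; the one assumption is that `vect` is set to `names(df)`, which is what the reply above guessed was intended, and not to `letters[1:5]` as in the original post.

```r
df <- data.frame(VAR = letters[1:5], VAL = c(1, 2, NA, 2, NA))
vect <- names(df)   # assumption: vect is meant to hold the column names

vect[2]             # an expression: evaluates to the column name "VAL"
'vect[2]'           # a string literal: never evaluated

# No column is literally named "vect[2]", so this lookup finds nothing
is.null(df[['vect[2]']])

# Without the quotes, the expression is evaluated and selects column VAL
identical(df[[vect[2]]], df$VAL)

# Positional indexing reaches the same column with less typing
identical(df[[2]], df$VAL)
```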
Re: [R] substitute column data frame based on name stored in variable in r
Thank you, but I think I got it wrong:
```
> df = data.frame(VAR = letters[1:5], VAL = c(1, 2, NA, 2, NA)); df
  VAR VAL
1   a   1
2   b   2
3   c  NA
4   d   2
5   e  NA
> vect = letters[1:5]
> df[df[['vect[2]']] == 2, 'vect[2]'] <- "No"; df
  VAR VAL vect[2]
1   a   1    <NA>
2   b   2    <NA>
3   c  NA    <NA>
4   d   2    <NA>
5   e  NA    <NA>
```

On Mon, Aug 9, 2021 at 11:25 AM Ivan Krylov wrote:
>
> On Mon, 9 Aug 2021 10:26:03 +0200
> Luigi Marongiu wrote:
>
> > vect = names(df)
> > sub_df[vect[1]]
>
> > df$column[df$column == value] <- new.value
>
> Let's see, an equivalent expression without the $ syntax is
> `df[['column']][df[['column']] == value] <- new.value`. Slightly
> shorter, matrix-like syntax would give us
> `df[df[['column']] == value, 'column'] <- new.value`.
>
> Now replace 'column' with vect[i] and you're done. The `[[`-indexing is
> used here to get the column contents instead of a single-column
> data.frame that `[`-indexing returns for lists.
>
> Also note that df[[names(df)[i]]] should be the same as df[[i]] for
> most data.frames.
>
> --
> Best regards,
> Ivan

--
Best regards,
Luigi
Re: [R] substitute column data frame based on name stored in variable in r
On Mon, 9 Aug 2021 10:26:03 +0200
Luigi Marongiu wrote:

> vect = names(df)
> sub_df[vect[1]]

> df$column[df$column == value] <- new.value

Let's see, an equivalent expression without the $ syntax is `df[['column']][df[['column']] == value] <- new.value`. Slightly shorter, matrix-like syntax would give us `df[df[['column']] == value, 'column'] <- new.value`.

Now replace 'column' with vect[i] and you're done. The `[[`-indexing is used here to get the column contents instead of a single-column data.frame that `[`-indexing returns for lists.

Also note that df[[names(df)[i]]] should be the same as df[[i]] for most data.frames.

--
Best regards,
Ivan
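A minimal sketch of the three equivalent forms, assuming a one-column toy data frame without missing values. The names `df1`, `df2`, `df3` and the numbers chosen for `value` and `new.value` are illustrative and not from the thread.

```r
# Three copies of the same toy data, one per syntax (assumed example values)
df1 <- data.frame(column = c(1, 2, 3, 2))
df2 <- df1
df3 <- df1

vect <- names(df1)   # column names stored in a variable
value <- 2
new.value <- -1

# 1. The $ syntax with a hard-coded column name
df1$column[df1$column == value] <- new.value

# 2. The same without $, using [[ ]] to get the column contents
df2[['column']][df2[['column']] == value] <- new.value

# 3. Matrix-like syntax, with 'column' replaced by vect[1]
df3[df3[[vect[1]]] == value, vect[1]] <- new.value

# All three leave the same values in the column
stopifnot(identical(df1$column, df2$column),
          identical(df1$column, df3$column))
```

When the column contains missing values, the third form needs the comparison wrapped in which(), because a data frame assignment does not accept NA in the row subscript.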
Re: [R] substitute column data frame based on name stored in variable in r
Thank you very much, but that would make even more work due to the duplication...

On Mon, Aug 9, 2021 at 10:53 AM Jim Lemon wrote:
>
> Hi Luigi,
> It looks to me as though you will have to copy the data frame or store
> the output in a new data frame.
>
> Jim
>
> On Mon, Aug 9, 2021 at 6:26 PM Luigi Marongiu wrote:
> >
> > Hello,
> > I would like to recursively select the columns of a dataframe by
> > storing the names of the dataframe in a vector and extracting one
> > element of the vector at a time. This I can do with, for instance:
> > ```
> > vect = names(df)
> > sub_df[vect[1]]
> > ```
> >
> > The problem is that I would like also to change the values of the
> > selected column using some logic as in `df$column[df$column == value]
> > <- new.value`, but I am confused on the syntax for the vectorized
> > version. Specifically, this does not work:
> > ```
> > sub_df[vect[1] == 0] = "No"
> > ```
> > What would be the correct approach?
> > Thank you

--
Best regards,
Luigi
Re: [R] substitute column data frame based on name stored in variable in r
Hi Luigi,
It looks to me as though you will have to copy the data frame or store the output in a new data frame.

Jim

On Mon, Aug 9, 2021 at 6:26 PM Luigi Marongiu wrote:
>
> Hello,
> I would like to recursively select the columns of a dataframe by
> storing the names of the dataframe in a vector and extracting one
> element of the vector at a time. This I can do with, for instance:
> ```
> vect = names(df)
> sub_df[vect[1]]
> ```
>
> The problem is that I would like also to change the values of the
> selected column using some logic as in `df$column[df$column == value]
> <- new.value`, but I am confused on the syntax for the vectorized
> version. Specifically, this does not work:
> ```
> sub_df[vect[1] == 0] = "No"
> ```
> What would be the correct approach?
> Thank you
[R] substitute column data frame based on name stored in variable in r
Hello,
I would like to recursively select the columns of a dataframe by storing the names of the dataframe in a vector and extracting one element of the vector at a time. This I can do with, for instance:
```
vect = names(df)
sub_df[vect[1]]
```

The problem is that I would also like to change the values of the selected column using some logic, as in `df$column[df$column == value] <- new.value`, but I am confused about the syntax for the vectorized version. Specifically, this does not work:
```
sub_df[vect[1] == 0] = "No"
```
What would be the correct approach?
Thank you