Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Luigi Marongiu
Got it, thank you!

On Tue, 10 Aug 2021, 00:12 David Winsemius,  wrote:

>
> On 8/9/21 12:22 PM, Luigi Marongiu wrote:
> > Thank you! it worked fine! The only pitfall is that `NA` became
> > `<NA>`. This is essentially the same thing anyway...
>
>
> It's not "essentially the same thing". It IS the same thing. The print
> function displays those '<>' characters flanking NA's when the class is
> factor. Type this at your console:
>
>
> factor(NA)
>
>
> --
>
> David
>
> >
> > On Mon, Aug 9, 2021 at 5:18 PM Ivan Krylov wrote:
> >> Thanks for providing a reproducible example!
> >>
> >> On Mon, 9 Aug 2021 15:33:53 +0200
> >> Luigi Marongiu  wrote:
> >>
> >>> df[df[['vect[2]']] == 2, 'vect[2]'] <- "No"
> >> Please don't quote R expressions that you want to evaluate. 'vect[2]'
> >> is just a string, like 'hello world' or 'I want to create a new column
> >> named "vect[2]" instead of accessing the second one'.
> >>
> >>> Error in `[<-.data.frame`(`*tmp*`, df[[vect[2]]] == 2, vect[2], value
> >>> = "No") : missing values are not allowed in subscripted assignments
> >>> of data frames
> >> Since df[[2]] contains NAs, comparisons with it also contain NAs. While
> >> it's possible to subset data.frames with NAs (the rows corresponding to
> >> the NAs are returned filled with NAs of corresponding types),
> >> assignment to undefined rows is not allowed. A simple way to remove the
> >> NAs and only leave the cases where df[[vect[2]]] == 2 is TRUE would be
> >> to use which(). Compare:
> >>
> >> df[df[[vect[2]]] == 2,]
> >> df[which(df[[vect[2]]] == 2),]
> >>
> >> --
> >> Best regards,
> >> Ivan
> >
> >
>


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread David Winsemius



On 8/9/21 12:22 PM, Luigi Marongiu wrote:

Thank you! it worked fine! The only pitfall is that `NA` became
`<NA>`. This is essentially the same thing anyway...



It's not "essentially the same thing". It IS the same thing. The print 
function displays those '<>' characters flanking NA's when the class is 
factor. Type this at your console:



factor(NA)
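
A minimal sketch of that point, using a made-up factor (the values and the name `f` are illustrative assumptions, not from the thread). It only shows that the angle brackets are a display convention of print and that the element is still an ordinary missing value:

```r
# Hypothetical toy factor; "yes"/"no" are illustrative values only.
f <- factor(c("yes", NA, "no"))

# print() flanks the missing element with angle brackets for factors.
print(f)

# The element is still a regular NA as far as is.na() is concerned ...
is.na(f)

# ... and NA is not stored as one of the factor's levels.
levels(f)
```

So any test written with is.na() should behave the same whether the console shows NA or the bracketed form.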


--

David



On Mon, Aug 9, 2021 at 5:18 PM Ivan Krylov  wrote:

Thanks for providing a reproducible example!

On Mon, 9 Aug 2021 15:33:53 +0200
Luigi Marongiu  wrote:


df[df[['vect[2]']] == 2, 'vect[2]'] <- "No"

Please don't quote R expressions that you want to evaluate. 'vect[2]'
is just a string, like 'hello world' or 'I want to create a new column
named "vect[2]" instead of accessing the second one'.


Error in `[<-.data.frame`(`*tmp*`, df[[vect[2]]] == 2, vect[2], value
= "No") : missing values are not allowed in subscripted assignments
of data frames

Since df[[2]] contains NAs, comparisons with it also contain NAs. While
it's possible to subset data.frames with NAs (the rows corresponding to
the NAs are returned filled with NAs of corresponding types),
assignment to undefined rows is not allowed. A simple way to remove the
NAs and only leave the cases where df[[vect[2]]] == 2 is TRUE would be
to use which(). Compare:

df[df[[vect[2]]] == 2,]
df[which(df[[vect[2]]] == 2),]

--
Best regards,
Ivan







Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Luigi Marongiu
Thank you! it worked fine! The only pitfall is that `NA` became
`<NA>`. This is essentially the same thing anyway...

On Mon, Aug 9, 2021 at 5:18 PM Ivan Krylov  wrote:
>
> Thanks for providing a reproducible example!
>
> On Mon, 9 Aug 2021 15:33:53 +0200
> Luigi Marongiu  wrote:
>
> > df[df[['vect[2]']] == 2, 'vect[2]'] <- "No"
>
> Please don't quote R expressions that you want to evaluate. 'vect[2]'
> is just a string, like 'hello world' or 'I want to create a new column
> named "vect[2]" instead of accessing the second one'.
>
> > Error in `[<-.data.frame`(`*tmp*`, df[[vect[2]]] == 2, vect[2], value
> > = "No") : missing values are not allowed in subscripted assignments
> > of data frames
>
> Since df[[2]] contains NAs, comparisons with it also contain NAs. While
> it's possible to subset data.frames with NAs (the rows corresponding to
> the NAs are returned filled with NAs of corresponding types),
> assignment to undefined rows is not allowed. A simple way to remove the
> NAs and only leave the cases where df[[vect[2]]] == 2 is TRUE would be
> to use which(). Compare:
>
> df[df[[vect[2]]] == 2,]
> df[which(df[[vect[2]]] == 2),]
>
> --
> Best regards,
> Ivan



-- 
Best regards,
Luigi



Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Ivan Krylov
Thanks for providing a reproducible example!

On Mon, 9 Aug 2021 15:33:53 +0200
Luigi Marongiu  wrote:

> df[df[['vect[2]']] == 2, 'vect[2]'] <- "No"

Please don't quote R expressions that you want to evaluate. 'vect[2]'
is just a string, like 'hello world' or 'I want to create a new column
named "vect[2]" instead of accessing the second one'.

> Error in `[<-.data.frame`(`*tmp*`, df[[vect[2]]] == 2, vect[2], value
> = "No") : missing values are not allowed in subscripted assignments
> of data frames

Since df[[2]] contains NAs, comparisons with it also contain NAs. While
it's possible to subset data.frames with NAs (the rows corresponding to
the NAs are returned filled with NAs of corresponding types),
assignment to undefined rows is not allowed. A simple way to remove the
NAs and only leave the cases where df[[vect[2]]] == 2 is TRUE would be
to use which(). Compare:

df[df[[vect[2]]] == 2,]
df[which(df[[vect[2]]] == 2),]
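
As a rough sketch of the difference, the snippet below reuses columns A and B from the example data posted in this thread; the helper names `idx`, `rows`, `tmp` and `res` are made up for illustration. It compares the two subsetting forms and then does the replacement through which():

```r
# Columns A and B as in the example data from the thread.
df <- data.frame(A = 1:5, B = c(1, 2, NA, 2, NA))
vect <- names(df)

# The comparison propagates the NAs of column B into the logical index.
idx <- df[[vect[2]]] == 2

# which() keeps only the positions where the comparison is TRUE.
rows <- which(idx)

# Subsetting with the NA-containing index also returns all-NA rows.
nrow(df[idx, ])
nrow(df[rows, ])

# Assignment with the NA-containing index fails; capture the outcome
# on a copy so the original data frame is left untouched.
res <- tryCatch({
  tmp <- df
  tmp[idx, vect[2]] <- "No"
  "assigned"
}, error = function(e) "error")

# Assignment through which() works; the column is coerced to character.
df[rows, vect[2]] <- "No"
df
```

An alternative with the same effect would be an index such as `!is.na(x) & x == 2`; which() is simply the shorter spelling.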

-- 
Best regards,
Ivan



Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Luigi Marongiu
You are right, vect will contain the names of the columns of the real
dataframe, but the actual simulation of the real case is more like
this:
```
> df = data.frame(A = 1:5, B = c(1, 2, NA, 2, NA), C = c("value is blue",
+   "Value is red", "empty", "  value is blue", " Value is green"), D = 9:13,
+   E = c("light", "light", "heavy", "heavy", "heavy")); df
  A  B               C  D     E
1 1  1   value is blue  9 light
2 2  2    Value is red 10 light
3 3 NA           empty 11 heavy
4 4  2   value is blue 12 heavy
5 5 NA  Value is green 13 heavy
> vect = LETTERS[1:5]
> df[df[['vect[2]']] == 2, 'vect[2]'] <- "No"; df
  A  B               C  D     E vect[2]
1 1  1   value is blue  9 light    <NA>
2 2  2    Value is red 10 light    <NA>
3 3 NA           empty 11 heavy    <NA>
4 4  2   value is blue 12 heavy    <NA>
5 5 NA  Value is green 13 heavy    <NA>
> df[df[[vect[2]]] == 2, vect[2]] <- "No"; df
Error in `[<-.data.frame`(`*tmp*`, df[[vect[2]]] == 2, vect[2], value = "No") :
  missing values are not allowed in subscripted assignments of data frames
```
But still, I get an extra column instead of working on column B
directly, and I can't dispense with the quotation marks...

On Mon, Aug 9, 2021 at 1:31 PM Ivan Krylov  wrote:
>
> On Mon, 9 Aug 2021 13:16:02 +0200
> Luigi Marongiu  wrote:
>
> > df = data.frame(VAR = ..., VAL = ...)
> > vect = letters[1:5]
>
> What is the relation between vect and the column names of the data
> frame? Is it your intention to choose rows or columns using `vect`?
>
> > df[df[['vect[2]']] == 2, 'vect[2]']
>
> '...' creates a string literal. If you want to evaluate an R
> expression, don't wrap it in quotes.
>
> I had assumed you wanted to put column names in the vector `vect`, but
> now I'm just confused: `vect` is the same as df$VAR, not colnames(df).
> What do you want to achieve?
>
> Again, you can access the second column with much less typing by
> addressing it directly: df[[2]]
>
> Does it help if you consult [**] or some other tutorial on subsetting
> in R?
>
> --
> Best regards,
> Ivan
>
> [**]
> https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Index-vectors
> https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Lists



-- 
Best regards,
Luigi



Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Ivan Krylov
On Mon, 9 Aug 2021 13:16:02 +0200
Luigi Marongiu  wrote:

> df = data.frame(VAR = ..., VAL = ...)
> vect = letters[1:5]

What is the relation between vect and the column names of the data
frame? Is it your intention to choose rows or columns using `vect`?

> df[df[['vect[2]']] == 2, 'vect[2]']

'...' creates a string literal. If you want to evaluate an R
expression, don't wrap it in quotes.
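
A small sketch of the distinction, assuming a made-up data frame whose column names are stored in `vect` (both objects are illustrative, not the poster's data):

```r
# Hypothetical data frame and name vector for illustration.
df <- data.frame(A = 1:3, B = 4:6, C = 7:9)
vect <- names(df)

# Quoted: just a string of characters, never evaluated.
"vect[2]"

# Unquoted: evaluated, giving the second stored column name.
vect[2]

# Looking up a column called "vect[2]" finds nothing ...
is.null(df[["vect[2]"]])

# ... while the evaluated expression selects column B.
df[[vect[2]]]
```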

I had assumed you wanted to put column names in the vector `vect`, but
now I'm just confused: `vect` is the same as df$VAR, not colnames(df).
What do you want to achieve?

Again, you can access the second column with much less typing by
addressing it directly: df[[2]]

Does it help if you consult [**] or some other tutorial on subsetting
in R?

-- 
Best regards,
Ivan

[**] 
https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Index-vectors
https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Lists



Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Luigi Marongiu
Thank you but I think I got it wrong:
```
> df = data.frame(VAR = letters[1:5], VAL = c(1, 2, NA, 2, NA)); df
  VAR VAL
1   a   1
2   b   2
3   c  NA
4   d   2
5   e  NA
> vect = letters[1:5]
> df[df[['vect[2]']] == 2, 'vect[2]'] <- "No"; df
  VAR VAL vect[2]
1   a   1    <NA>
2   b   2    <NA>
3   c  NA    <NA>
4   d   2    <NA>
5   e  NA    <NA>
```

On Mon, Aug 9, 2021 at 11:25 AM Ivan Krylov  wrote:
>
> On Mon, 9 Aug 2021 10:26:03 +0200
> Luigi Marongiu  wrote:
>
> > vect = names(df)
> > sub_df[vect[1]]
>
> > df$column[df$column == value] <- new.value
>
> Let's see, an equivalent expression without the $ syntax is
> `df[['column']][df[['column']] == value] <- new.value`. Slightly
> shorter, matrix-like syntax would give us
> `df[df[['column']] == value, 'column'] <- new.value`.
>
> Now replace 'column' with vect[i] and you're done. The `[[`-indexing is
> used here to get the column contents instead of a single-column
> data.frame that `[`-indexing returns for lists.
>
> Also note that df[[names(df)[i]]] should be the same as df[[i]] for
> most data.frames.
>
> --
> Best regards,
> Ivan



-- 
Best regards,
Luigi



Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Ivan Krylov
On Mon, 9 Aug 2021 10:26:03 +0200
Luigi Marongiu  wrote:

> vect = names(df)
> sub_df[vect[1]]

> df$column[df$column == value] <- new.value

Let's see, an equivalent expression without the $ syntax is
`df[['column']][df[['column']] == value] <- new.value`. Slightly
shorter, matrix-like syntax would give us
`df[df[['column']] == value, 'column'] <- new.value`.

Now replace 'column' with vect[i] and you're done. The `[[`-indexing is
used here to get the column contents instead of a single-column
data.frame that `[`-indexing returns for lists.

Also note that df[[names(df)[i]]] should be the same as df[[i]] for
most data.frames.
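
To make the three spellings concrete, here is a sketch on a made-up data frame; `d1`, `d2`, `d3`, `value` and `new.value` are placeholder names chosen for illustration, and the data contain no NAs so that all three forms run:

```r
# Hypothetical data without missing values.
df <- data.frame(x = c(0, 1, 0), y = c(2, 0, 2))
vect <- names(df)
i <- 1
value <- 0
new.value <- "No"

# 1. The $ form with a hard-coded column name.
d1 <- df
d1$x[d1$x == value] <- new.value

# 2. The same thing with `[[` and the name taken from vect.
d2 <- df
d2[[vect[i]]][d2[[vect[i]]] == value] <- new.value

# 3. Matrix-like form: row condition first, column name second.
d3 <- df
d3[d3[[vect[i]]] == value, vect[i]] <- new.value

# `[` keeps a one-column data.frame, `[[` extracts the column itself.
class(df[vect[i]])
class(df[[vect[i]]])

# Name-based and position-based extraction agree here.
identical(df[[names(df)[2]]], df[[2]])
```

Wrapping the third form in `for (i in seq_along(vect))` would apply the same replacement to every column named in `vect`.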

-- 
Best regards,
Ivan



Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Luigi Marongiu
Thank you very much, but that would make even more work due to the
duplication...

On Mon, Aug 9, 2021 at 10:53 AM Jim Lemon  wrote:
>
> Hi Luigi,
> It looks to me as though you will have to copy the data frame or store
> the output in a new data frame.
>
> Jim
>
> On Mon, Aug 9, 2021 at 6:26 PM Luigi Marongiu wrote:
> >
> > Hello,
> > I would like to recursively select the columns of a dataframe by
> > storing the names of the dataframe in a vector and extracting one
> > element of the vector at a time. This I can do with, for instance:
> > ```
> > vect = names(df)
> > sub_df[vect[1]]
> > ```
> >
> > The problem is that I would like also to change the values of the
> > selected column using some logic as in `df$column[df$column == value]
> > <- new.value`, but I am confused on the syntax for the vectorized
> > version. Specifically, this does not work:
> > ```
> > sub_df[vect[1] == 0] = "No"
> > ```
> > What would be the correct approach?
> > Thank you
> >



-- 
Best regards,
Luigi



Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Jim Lemon
Hi Luigi,
It looks to me as though you will have to copy the data frame or store
the output in a new data frame.

Jim

On Mon, Aug 9, 2021 at 6:26 PM Luigi Marongiu  wrote:
>
> Hello,
> I would like to recursively select the columns of a dataframe by
> storing the names of the dataframe in a vector and extracting one
> element of the vector at a time. This I can do with, for instance:
> ```
> vect = names(df)
> sub_df[vect[1]]
> ```
>
> The problem is that I would like also to change the values of the
> selected column using some logic as in `df$column[df$column == value]
> <- new.value`, but I am confused on the syntax for the vectorized
> version. Specifically, this does not work:
> ```
> sub_df[vect[1] == 0] = "No"
> ```
> What would be the correct approach?
> Thank you
>



[R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Luigi Marongiu
Hello,
I would like to recursively select the columns of a dataframe by
storing the names of the dataframe in a vector and extracting one
element of the vector at a time. This I can do with, for instance:
```
vect = names(df)
sub_df[vect[1]]
```

The problem is that I would like also to change the values of the
selected column using some logic as in `df$column[df$column == value]
<- new.value`, but I am confused on the syntax for the vectorized
version. Specifically, this does not work:
```
sub_df[vect[1] == 0] = "No"
```
What would be the correct approach?
Thank you
