At 16:54 on 15/01/2023, Sorkin, John wrote:
I am new to this thread. At the risk of repeating something that has already been shown, 
below I demonstrate how to drop a column from a data frame using a wild card, i.e. a 
column whose name starts with "th", using nothing more than base R functions and 
base R syntax. While additions to R such as the tidyverse can be very helpful, much of 
what they do can be accomplished simply in base R.

# Create data frame with three columns
one <- rep(1,10)
one
two <- rep(2,10)
two
three <- rep(3,10)
three
mydata <- data.frame(one=one, two=two, three=three)
cat("Data frame with three columns\n")
mydata

# Drop the column whose name starts with "th", i.e. column three
# Find the location of the column ("^" anchors the match to the start of the name)
ColumToDelete <- grep("^th", colnames(mydata))
cat("The column to be dropped is the column called three, which is column",
    ColumToDelete, "\n")
ColumToDelete

# Drop the column whose name starts with "th"
newdata2 <- mydata[, -ColumToDelete, drop = FALSE]
cat("Data frame after dropping column whose name is three\n")
newdata2

I hope this helps.
John


________________________________________
From: R-help <r-help-boun...@r-project.org> on behalf of Valentin Petzel 
<valen...@petzel.at>
Sent: Saturday, January 14, 2023 1:21 PM
To: avi.e.gr...@gmail.com
Cc: 'R-help Mailing List'
Subject: Re: [R] Removing variables from data frame with a wile card

Hello Avi,

while something like d$something <- ... may seem to modify the data directly, 
it does not actually do so. Most R objects behave as if they were immutable, that is, 
the object does not change after creation. This guarantees that if you hold a 
binding to an object, the object won't change behind your back.

One data structure is in fact mutable: the environment. For 
example, compare

L <- list()
local({L$a <- 3})
L$a

with

E <- new.env()
local({E$a <- 3})
E$a

The latter does in fact work, because the same environment is modified, while in the 
first case only a modified copy of the list is made, so L$a is still NULL.
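
The same contrast can be sketched with ordinary functions instead of local(). This is only an illustration: the helper names add_a_to_list and add_a_to_env are made up for this example and do not appear elsewhere in the thread.

```r
# A list argument is copied when modified; an environment is modified in place.
add_a_to_list <- function(x) {
  x$a <- 3          # changes only the function's local copy of the list
  invisible(x)
}

add_a_to_env <- function(e) {
  e$a <- 3          # changes the very environment the caller passed in
  invisible(e)
}

L <- list()
add_a_to_list(L)
is.null(L$a)        # TRUE: the caller's list is untouched

E <- new.env()
add_a_to_env(E)
E$a                 # 3: the caller's environment was changed
```

To keep the change made to a list, the result has to be assigned back, e.g. L <- add_a_to_list(L).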

Under the hood this is syntactic sugar: if R sees something like

f(a) <- ...

it will look for a function `f<-` and call

a <- `f<-`(a, value = ...)

(this also happens, for example, when you do names(x) <- ...)

So in our case this is equivalent to creating a copy with the columns removed 
and rebinding the symbol in the current environment to the result.
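
A small sketch of that mechanism, assuming a made-up replacement function called `second<-` (it is not part of base R). It shows that the assignment form and the explicit call give the same result:

```r
# Hypothetical replacement function: sets the second element of a vector.
# A replacement function must take the new value in an argument named 'value'
# and return the whole modified object.
`second<-` <- function(x, value) {
  x[2] <- value
  x
}

x <- 1:5
second(x) <- 10L                   # the assignment form

y <- 1:5
y <- `second<-`(y, value = 10L)    # the explicit call it stands for

identical(x, y)                    # TRUE
```

In both cases the symbol is rebound to the object returned by the function; the original vector is not altered.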

The data.table package breaks with this convention and uses C-based routines 
that change data in place, without copying the object. Doing

d[, (cols_to_remove) := NULL]

will actually change the data.

Regards,
Valentin

14.01.2023 18:28:33 avi.e.gr...@gmail.com:

Steven,

Just want to add a few things to what people wrote.

In base R, the methods mentioned will let you make a copy of your original DF 
that is missing the items you are selecting that match your pattern.

That is fine.

For some purposes, you want to keep the original data.frame and remove a column 
within it. You can do that in several ways, but the simplest is to set the 
column to NULL, as in:

mydata$NAME <- NULL

Using the mydata["NAME"] notation, you can do the same for every column your grep 
matches, with a loop or a functional programming method.
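
A minimal sketch of both variants, using a small made-up data frame (the column names and values here are only for illustration):

```r
mydata <- data.frame(year = 1:3, yr3 = 0, yr4 = 1, weight = 2.5)
mydata2 <- mydata

# Variant 1: loop over the matching names and set each column to NULL.
# If nothing matches, the loop body never runs and mydata is left alone.
for (nm in grep("^yr", names(mydata), value = TRUE)) {
  mydata[[nm]] <- NULL
}
names(mydata)     # "year" "weight"

# Variant 2: assign NULL to all matching columns at once.
to_drop <- grep("^yr", names(mydata2), value = TRUE)
mydata2[to_drop] <- NULL
names(mydata2)    # "year" "weight"
```

Note that "year" survives in both cases, since it does not start with "yr".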

R does have optimizations that make this matter less, as a partial copy of a 
data.frame shares its unchanged columns with the original until one of them is modified.

For those who like to use the tidyverse, it comes with lots of tools that let 
you select columns that start with or end with or contain some pattern and I 
find that way easier.



-----Original Message-----
From: R-help <r-help-boun...@r-project.org> On Behalf Of Steven Yen
Sent: Saturday, January 14, 2023 7:49 AM
To: Andrew Simmons <akwsi...@gmail.com>
Cc: R-help Mailing List <r-help@r-project.org>
Subject: Re: [R] Removing variables from data frame with a wile card

Thanks to all. Very helpful.

Steven from iPhone

On Jan 14, 2023, at 3:08 PM, Andrew Simmons <akwsi...@gmail.com> wrote:

You'll want to use grep() or grepl(). By default, grep() uses
extended regular expressions to find matches, but you can also use
perl regular expressions and globbing (after converting to a regular 
expression).
For example:

grepl("^yr", colnames(mydata))

will tell you which 'colnames' start with "yr". If you'd rather
use globbing:

grepl(glob2rx("yr*"), colnames(mydata))

Then you might write something like this to remove the columns starting with yr:

mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]
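
As a rough check of the above, here is a sketch on a cut-down version of the column names from the original question. The data are placeholders (all zeros), and the name 'trimmed' is made up for the example:

```r
# A stand-in data frame with a few of the column names from the question.
cols <- c("year", "weight", "college", "yr3", "yr4", "yr28")
mydata <- as.data.frame(matrix(0, nrow = 2, ncol = length(cols),
                               dimnames = list(NULL, cols)))

# The glob "yr*" is translated to the regular expression "^yr".
utils::glob2rx("yr*")

# Logical index: TRUE for the columns to keep.
keep <- !grepl("^yr", colnames(mydata))
trimmed <- mydata[, keep, drop = FALSE]
colnames(trimmed)    # "year" "weight" "college"
```

The "^" anchor is what keeps "year" out of the match; drop = FALSE makes sure the result stays a data frame even if only one column is left.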

On Sat, Jan 14, 2023 at 1:56 AM Steven T. Yen <st...@ntu.edu.tw> wrote:

I have a data frame containing variables "yr3",...,"yr28".

How do I remove them with a wild card, something similar to "del yr*"
in Windows/DOS? Thank you.

colnames(mydata)
   [1] "year"       "weight"     "confeduc"   "confothr" "college"
   [6] ...
[41] "yr3"        "yr4"        "yr5"        "yr6" "yr7"
[46] "yr8"        "yr9"        "yr10"       "yr11" "yr12"
[51] "yr13"       "yr14"       "yr15"       "yr16" "yr17"
[56] "yr18"       "yr19"       "yr20"       "yr21" "yr22"
[61] "yr23"       "yr24"       "yr25"       "yr26" "yr27"
[66] "yr28"...

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Hello,

Actually, Bill had addressed this in his post yesterday [1].
With your example,


one <- rep(1,10)
two <- rep(2,10)
three <- rep(3,10)
mydata <- data.frame(one=one, two=two, three=three)

ColumToDelete <- grep("fo",colnames((mydata)))
ColumToDelete
#> integer(0)
ColumToDeleteLogical <- grepl("fo",colnames((mydata)))
ColumToDeleteLogical
#> [1] FALSE FALSE FALSE

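# Try to drop the column whose name starts with "fo" (there is none)
# -integer(0) is still integer(0), so no columns are selected: empty data.frame
mydata[, -ColumToDelete]
#> data frame with 0 columns and 10 rows

# with the logical index, nothing is deleted
mydata[, !ColumToDeleteLogical]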
#>    one two three
#> 1    1   2     3
#> 2    1   2     3
#> 3    1   2     3
#> 4    1   2     3
#> 5    1   2     3
#> 6    1   2     3
#> 7    1   2     3
#> 8    1   2     3
#> 9    1   2     3
#> 10   1   2     3
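
One way to package the safer logical-index approach is a small helper. This is only a sketch: drop_matching is a made-up name, not a base R function.

```r
# Hypothetical helper: drop every column whose name matches 'pattern'.
# With a logical index, "no match" simply keeps all columns.
drop_matching <- function(df, pattern) {
  keep <- !grepl(pattern, colnames(df))
  df[, keep, drop = FALSE]
}

mydata <- data.frame(one = rep(1, 10), two = rep(2, 10), three = rep(3, 10))

colnames(drop_matching(mydata, "^th"))   # "one" "two"
colnames(drop_matching(mydata, "^fo"))   # "one" "two" "three"

# If integer positions from grep() are preferred, guard the empty case:
idx <- grep("^fo", colnames(mydata))
result <- if (length(idx) > 0) mydata[, -idx, drop = FALSE] else mydata
ncol(result)                             # 3
```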



[1] https://stat.ethz.ch/pipermail/r-help/2023-January/476682.html


Hope this helps,

Rui Barradas
