Re: [R] remove a row

2019-11-28 Thread Bert Gunter
Of course! Use regexec() and regmatches()

> regmatches(dat$varx, regexec("(^[[:digit:]]{1,3})([[:alpha:]]{1,2})([[:digit:]]{1,5}$)", dat$varx))
[[1]]
[1] "9F209" "9" "F" "209"

[[2]]
character(0)

[[3]]
[1] "2F250" "2" "F" "250"

[[4]]
character(0)

[[5]]
character(0)

[[6]]
character(0)

[[7]]
character(0)

[[8]]
[1] "121FL50" "121" "FL"  "50"

Each list component is character(0) when there is no match; otherwise it is a
character vector with the whole matched entry first, followed by the strings
matching the 1st, 2nd, and 3rd parenthesized subexpressions of the pattern.
These correspond to your area code, region code, and trailing numeric part.
I leave it to you to extract what you want from this list, e.g., via lapply().

For details, see the Help pages for the two functions.
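
As a rough sketch of that last step (not part of the original reply), the
match list can be turned into three new columns. The names pat, ok, parts,
area, region, and num are made up for illustration; the ^ and $ anchors are
moved outside the groups, which matches the same strings; lengths() assumes
R 3.2.0 or later.

```r
# Sample data from the original question
dat <- read.table(text = "rown varx
1 9F209
2 FL250
3 2F250
4 102250
5 102FL
6 102
7 1212FL250
8 121FL50", header = TRUE, stringsAsFactors = FALSE)

# Same pattern as above, with the anchors outside the capture groups
pat <- "^([[:digit:]]{1,3})([[:alpha:]]{1,2})([[:digit:]]{1,5})$"
m <- regmatches(dat$varx, regexec(pat, dat$varx))

# A non-match yields character(0), so its length is 0
ok <- lengths(m) > 0

# Elements 2 to 4 of each match are the three captured groups;
# stack them into a character matrix with one row per matching entry
parts <- do.call(rbind, lapply(m[ok], `[`, 2:4))

# Keep only the matching rows and attach the pieces (illustrative names)
res <- dat[ok, ]
res$area   <- parts[, 1]
res$region <- parts[, 2]
res$num    <- as.integer(parts[, 3])
```

If no entry matches at all, do.call(rbind, ...) returns NULL, so that case
would need a guard before the columns are attached.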

-- Bert

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] remove a row

2019-11-28 Thread Ashta
Thank you so much, Bert.

Is it possible to split varx into three separate variables (area code,
region, and the numeric part)?




Re: [R] remove a row

2019-11-28 Thread Bert Gunter
Use regular expressions.

See ?regexp  and ?grep

Using your example:

> grep("^[[:digit:]]{1,3}[[:alpha:]]{1,2}[[:digit:]]{1,5}$", dat$varx, value = TRUE)
[1] "9F209"   "2F250"   "121FL50"
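
Since the goal is to drop rows rather than just list the matching strings,
a logical index from grepl() can subset the data frame directly. This is a
minimal sketch, assuming dat is built as in the original question; pat and
kept are illustrative names.

```r
# Sample data from the original question
dat <- read.table(text = "rown varx
1 9F209
2 FL250
3 2F250
4 102250
5 102FL
6 102
7 1212FL250
8 121FL50", header = TRUE, stringsAsFactors = FALSE)

# 1-3 digits, then 1-2 letters, then 1-5 digits, and nothing else
pat <- "^[[:digit:]]{1,3}[[:alpha:]]{1,2}[[:digit:]]{1,5}$"

# grepl() returns TRUE/FALSE per row, so it can index rows directly
kept <- dat[grepl(pat, dat$varx), ]
```

The original row names (1, 3, 8) are retained in kept, which matches the
desired output in the question.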

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )





[R] remove a row

2019-11-28 Thread Ashta
Hi all, I want to remove rows from a data frame based on a condition on one
of its variables.
When split, the string should follow a 3-2-5 format (up to 3 digits, up to 2
letters, and up to 5 digits), i.e. area code, region, numeric part. The area
code can be 1, 2, or 3 digits but not more than three, the region is at most
2 letters, and the final numeric part is at most 5 digits, so the maximum
length of this variable is 10. Anything that does not fit this pattern
should be excluded.
As an example:

dat <- read.table(text = "rown varx
1 9F209
2 FL250
3 2F250
4 102250
5 102FL
6 102
7 1212FL250
8 121FL50", header = TRUE, stringsAsFactors = FALSE)

1  9F209      # keep
2  FL250      # remove, no area code
3  2F250      # keep
4  102250     # remove, no region code
5  102FL      # remove, no numeric part after region code
6  102        # remove, no region code and no numeric part
7  1212FL250  # remove, area code is more than three digits
8  121FL50    # keep

The desired output should be
1   9F209
3   2F250
8  121FL50

How do I do this in an efficient way?

Thank you in advance



Re: [R] Remove a row containing a specific value for a column

2013-04-07 Thread arun
Hi,
DATA[DATA$A != "Blue1", ]
#     A B C D
#1 Red1 1 1 1
#3 Red2 1 1 1
#4 Red3 1 1 1
# or
DATA[!grepl("Blue1", DATA$A), ]
#     A B C D
#1 Red1 1 1 1
#3 Red2 1 1 1
#4 Red3 1 1 1
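
The two forms are not quite interchangeable, which the small sketch below
tries to show. The data here are hypothetical (an extra "Blue10" and an NA
are added to the example on purpose), and the object names are made up.

```r
# Hypothetical data: like the example, plus "Blue10" and a missing value
DATA <- data.frame(A = c("Red1", "Blue1", "Red2", "Blue10", NA),
                   B = 1, stringsAsFactors = FALSE)

# != compares whole values; an NA comparison produces an all-NA row
exact <- DATA[DATA$A != "Blue1", ]

# %in% never returns NA, so the row with a missing A is kept unchanged
exact_na_safe <- DATA[!(DATA$A %in% "Blue1"), ]

# grepl() matches substrings, so "Blue10" is dropped as well;
# fixed = TRUE treats "Blue1" as a literal string, not a regex
partial <- DATA[!grepl("Blue1", DATA$A, fixed = TRUE), ]
```

For removing one exact value, the != or %in% form is probably the safer
choice; grepl() is more suited to pattern-based conditions.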
A.K.








[R] Remove a row containing a specific value for a column

2013-04-07 Thread Beatriz González Domínguez
Dear all,

Could anyone help me with the following?

DATA <- data.frame(rbind(c("Red1", 1, 1, 1), c("Blue1", 1, 1, 1),
                         c("Red2", 1, 1, 1), c("Red3", 1, 1, 1)))
colnames(DATA) <- c("A", "B", "C", "D")

# Option 1
DATA <- DATA[-2, ]  # Same result I would like to achieve with Option 2

# Option 2 - I would like to do it in this way. Do you know how it could be done?
# DATA <- THE CODE WOULD SAY (remove the row which contains the value "Blue1" in column "A")

Many thanks!

Bea


Re: [R] remove a row from a dataframe, row names disappear - solution

2010-02-05 Thread Euan Reavie
Thank you, Sarah. I'm glad it was a quick fix.




-- 

* The opinions expressed in this message do, in fact, represent the opinions
of my employers, their families, and everybody within a 10 mile radius.



Re: [R] remove a row from a dataframe, row names disappear

2010-02-05 Thread Sarah Goslee
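You're not only removing a row of data, you are invoking the default
behavior of subsetting with [ (drop = TRUE), which collapses the result to
the simplest possible type. For a data frame with a single column, that is
a vector. Vectors have no rows, and thus no row names. With more than one
column the result stays a data frame, which is why you don't see this in
larger datasets.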

You need the drop=FALSE argument, as in
ENV <- ENV[-1, , drop=FALSE]
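
A small sketch of the difference, rebuilding the one-column data frame from
the question (the names dropped and kept are only for illustration):

```r
# One-column data frame with the sample IDs as row names
ENV <- data.frame(
  TPlog = c(0.601, 0.602, 0.779, 0.702, 0.978, 2.718),
  row.names = c("001S29H", "002S42H", "003S43S",
                "004S43S", "005S51H", "006S52P")
)

# Default: the single remaining column collapses to a plain numeric vector
dropped <- ENV[-1, ]

# drop = FALSE: the result stays a data frame and keeps its row names
kept <- ENV[-1, , drop = FALSE]
```

Here is.data.frame(kept) is TRUE and rownames(kept) still holds the five
remaining sample IDs, while dropped has no dimensions at all.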

Sarah



-- 
Sarah Goslee
http://www.functionaldiversity.org



[R] remove a row from a dataframe, row names disappear

2010-02-05 Thread Euan Reavie
I find this odd because it doesn't appear to happen in larger datasets. I
have the following data set ENV with the first column set as row.names:

> ENV
         TPlog
001S29H  0.601
002S42H  0.602
003S43S  0.779
004S43S  0.702
005S51H  0.978
006S52P  2.718

If I apply > ENV <- ENV[-1,]  # remove first row of data (right?)
...ENV comes back as:

[1] 0.602 0.779 0.702 0.978 2.718

So I am losing the row name info. I also notice that, if the first two
values in the TPlog column are the same, both values are removed! What's
going on, and why does this same thing not happen in more complex datasets
with more than one column of values?

Many thanks - Euan.
