Re: [R] Problem dropping rows based on values in a column

2007-03-25 Thread Bill.Venables
I think you want

delete <- c(14772,14744)
jdata <- subset(jdata, !(PID %in% delete))
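
A minimal sketch of the same idea on a toy data frame. The five PIDs are the tail of the posted vector; the 'score' column is invented here purely for illustration, to show that whole rows are dropped.

```r
# Hypothetical stand-in for jdata: last five PIDs from the post,
# plus an invented 'score' column (an assumption, not in the original data).
jdata <- data.frame(PID   = c(14854, 10481, 14793, 14744, 14772),
                    score = c(1.2, 3.4, 5.6, 7.8, 9.0))

delete <- c(14772, 14744)

# Keep rows whose PID is NOT in 'delete'; the order of 'delete' is irrelevant
kept <- subset(jdata, !(PID %in% delete))

# The same selection written with plain logical indexing
kept2 <- jdata[!(jdata$PID %in% delete), ]

print(kept)
```

Either form should leave only the rows whose PID is absent from 'delete'.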



Bill Venables
CSIRO Laboratories
PO Box 120, Cleveland, 4163
AUSTRALIA
Office Phone (email preferred):   +61 7 3826 7251
Fax (if absolutely necessary):  +61 7 3826 7304
Mobile:   (I don't have one!)
Home Phone:+61 7 3286 7700
mailto:[EMAIL PROTECTED]
http://www.cmis.csiro.au/bill.venables/ 

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of John Sorkin
Sent: Monday, 26 March 2007 12:19 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Problem dropping rows based on values in a column

I am trying to drop rows of a dataframe based on values of the column
PID, but my strategy is not working. I hope someone can tell me what I
am doing incorrectly.


# Values of PID column
> jdata[,"PID"]
 [1] 16608 16613 16355 16378 16371 16280 16211 16169 16025 11595 15883 15682 15617 15615 15212 14862 16539
[18] 12063 16755 16720 16400 16257 16209 16200 16144 11598 13594 15419 15589 15982 15825 15834 15491 15822
[35] 15803 15795 10202 15680 15587 15552 15588 15375 15492 15568 15196 10217 15396 15477 15446 15374 14092
[52] 14033 15141 14953 15473 10424 13445 14854 10481 14793 14744 14772

#Prepare to drop last two rows, rows that have 14744 and 14772 in the PID column
> delete<-c(14772,14744)

#Try to delete last two rows, but as you will see, I am not able to drop the last two rows.
> jdata[jdata$PID!=delete,"PID"]
 [1] 16608 16613 16355 16378 16371 16280 16211 16169 16025 11595 15883 15682 15617 15615 15212 14862 16539
[18] 12063 16755 16720 16400 16257 16209 16200 16144 11598 13594 15419 15589 15982 15825 15834 15491 15822
[35] 15803 15795 10202 15680 15587 15552 15588 15375 15492 15568 15196 10217 15396 15477 15446 15374 14092
[52] 14033 15141 14953 15473 10424 13445 14854 10481 14793 14744 14772
> 


Thanks,
John

John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine Claude D. Pepper OAIC,
University of Maryland Clinical Nutrition Research Unit, and
Baltimore VA Center Stroke of Excellence

University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524

(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
[EMAIL PROTECTED]

Confidentiality Statement:
This email message, including any attachments, is for the so...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem dropping rows based on values in a column

2007-03-25 Thread Wensui Liu
Sorry, John
Marc's method is correct.



-- 
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)



Re: [R] Problem dropping rows based on values in a column

2007-03-25 Thread Wensui Liu
> jdata
PID
1 14854
2 10481
3 14793
4 14744
5 14772
> jdata[jdata[1] != delete, 1]
[1] 14854 10481 14793




-- 
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)



Re: [R] Problem dropping rows based on values in a column

2007-03-25 Thread Marc Schwartz
On Sun, 2007-03-25 at 22:19 -0400, John Sorkin wrote:
> I am trying to drop rows of a dataframe based on values of the column PID, 
> but my strategy is not working. I hope someone can tell me what I am doing 
> incorrectly.
> 
> 
> # Values of PID column
> > jdata[,"PID"]
>  [1] 16608 16613 16355 16378 16371 16280 16211 16169 16025 11595 15883 15682 15617 15615 15212 14862 16539
> [18] 12063 16755 16720 16400 16257 16209 16200 16144 11598 13594 15419 15589 15982 15825 15834 15491 15822
> [35] 15803 15795 10202 15680 15587 15552 15588 15375 15492 15568 15196 10217 15396 15477 15446 15374 14092
> [52] 14033 15141 14953 15473 10424 13445 14854 10481 14793 14744 14772
> 
> #Prepare to drop last two rows, rows that have 14744 and 14772 in the PID column
> > delete<-c(14772,14744)
> 
> #Try to delete last two rows, but as you will see, I am not able to drop the last two rows.
> > jdata[jdata$PID!=delete,"PID"]
>  [1] 16608 16613 16355 16378 16371 16280 16211 16169 16025 11595 15883 15682 15617 15615 15212 14862 16539
> [18] 12063 16755 16720 16400 16257 16209 16200 16144 11598 13594 15419 15589 15982 15825 15834 15491 15822
> [35] 15803 15795 10202 15680 15587 15552 15588 15375 15492 15568 15196 10217 15396 15477 15446 15374 14092
> [52] 14033 15141 14953 15473 10424 13445 14854 10481 14793 14744 14772
> > 

John,

If you had:

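  delete <- c(14744, 14772)

it would work, but only by coincidence in this particular setting.

That is because '!=' compares the two vectors element by element,
recycling the shorter one along the longer one, so each PID is tested
against only one of the two values in 'delete'. The two values you want
to drop sit next to each other at positions 61 and 62, and the way that
you have them above, they are reversed from the order in which the
values actually appear, so each one is compared with the other value
and neither is dropped.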

For example:

Vec <- 1:10
delete <- 10:9

> Vec[Vec != delete]
 [1]  1  2  3  4  5  6  7  8  9 10


However:

delete <- 9:10

> Vec[Vec != delete]
[1] 1 2 3 4 5 6 7 8


Note what happens when the values in the source vector are not
sequential:

Vec <- sample(10)

> Vec
 [1]  5  1  7  3 10  8  2  6  9  4

delete <- 9:10

> Vec[Vec != delete]
[1]  5  1  7  3 10  8  2  6  4

delete <- 10:9

> Vec[Vec != delete]
[1] 5 1 7 3 8 2 6 9 4


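In both cases you get a result in which the first value in 'delete' is
removed, but not the second. Here 9 and 10 both happen to fall at odd
positions in 'Vec', so each is only ever compared with the first
element of 'delete'.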
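
To make the recycling visible, here is a small sketch that builds the recycled comparison vector by hand. The name 'recycled' is introduced only for this illustration, and 'Vec' is typed in literally so the result does not depend on sample().

```r
Vec    <- c(5, 1, 7, 3, 10, 8, 2, 6, 9, 4)   # the permuted vector shown above
delete <- 9:10

# What '!=' effectively compares against: 'delete' repeated to length(Vec)
recycled <- rep(delete, length.out = length(Vec))
# recycled is 9 10 9 10 9 10 9 10 9 10

# Element-by-element comparison, position by position
keep <- Vec != recycled

# Same logical vector as the original expression produces
stopifnot(identical(keep, Vec != delete))

# Only the 9 at position 9 (lined up against a 9) is dropped;
# the 10 at position 5 is lined up against a 9, so it survives.
Vec[keep]
```

Note that R only warns about recycling when the longer length is not a multiple of the shorter one, so with 10 (or 62) values and a 'delete' of length 2 the mismatch passes silently.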


When performing a logical comparison of a value to see if it is (or is
not) in a set of values, you want to use '%in%':

Vec <- 1:10

delete <- 10:9

> Vec[!Vec %in% delete]
[1] 1 2 3 4 5 6 7 8

delete <- 9:10

> Vec[!Vec %in% delete]
[1] 1 2 3 4 5 6 7 8


It also works in the permuted vector:

Vec <- c(5, 1, 7, 3, 10, 8, 2, 6, 9, 4)

> Vec[!Vec %in% delete]
[1] 5 1 7 3 8 2 6 4


See ?"%in%" for more information.
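
As a rough sketch of what the help page describes, '%in%' behaves like a thin wrapper around match(). The helper name 'in_set' below is hypothetical and exists only to illustrate that definition.

```r
# Hypothetical helper mirroring how ?"%in%" describes the operator:
# TRUE wherever an element of x is found anywhere in 'table'.
in_set <- function(x, table) match(x, table, nomatch = 0L) > 0L

Vec    <- c(5, 1, 7, 3, 10, 8, 2, 6, 9, 4)
delete <- 10:9

# One logical per element of Vec, whatever the length or order of 'delete'
in_set(Vec, delete)

# Agrees with the built-in operator
stopifnot(identical(in_set(Vec, delete), Vec %in% delete))

Vec[!in_set(Vec, delete)]
```

Because every element of 'Vec' is looked up in the whole of 'delete', no recycling is involved and the order of 'delete' cannot matter.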

HTH,

Marc Schwartz



[R] Problem dropping rows based on values in a column

2007-03-25 Thread John Sorkin
I am trying to drop rows of a dataframe based on values of the column PID, but 
my strategy is not working. I hope someone can tell me what I am doing 
incorrectly.


# Values of PID column
> jdata[,"PID"]
 [1] 16608 16613 16355 16378 16371 16280 16211 16169 16025 11595 15883 15682 15617 15615 15212 14862 16539
[18] 12063 16755 16720 16400 16257 16209 16200 16144 11598 13594 15419 15589 15982 15825 15834 15491 15822
[35] 15803 15795 10202 15680 15587 15552 15588 15375 15492 15568 15196 10217 15396 15477 15446 15374 14092
[52] 14033 15141 14953 15473 10424 13445 14854 10481 14793 14744 14772

#Prepare to drop last two rows, rows that have 14744 and 14772 in the PID column
> delete<-c(14772,14744)

#Try to delete last two rows, but as you will see, I am not able to drop the last two rows.
> jdata[jdata$PID!=delete,"PID"]
 [1] 16608 16613 16355 16378 16371 16280 16211 16169 16025 11595 15883 15682 15617 15615 15212 14862 16539
[18] 12063 16755 16720 16400 16257 16209 16200 16144 11598 13594 15419 15589 15982 15825 15834 15491 15822
[35] 15803 15795 10202 15680 15587 15552 15588 15375 15492 15568 15196 10217 15396 15477 15446 15374 14092
[52] 14033 15141 14953 15473 10424 13445 14854 10481 14793 14744 14772
> 


Thanks,
John

John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine Claude D. Pepper OAIC,
University of Maryland Clinical Nutrition Research Unit, and
Baltimore VA Center Stroke of Excellence

University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524

(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
[EMAIL PROTECTED]

Confidentiality Statement:
This email message, including any attachments, is for the so...{{dropped}}
