Re: [R] help sub setting data frame

2009-10-22 Thread Ista Zahn
Hi Sean,
Comment inline below.

On Thu, Oct 22, 2009 at 5:39 PM, Sean MacEachern sean.mace...@gmail.com wrote:
 Hi,

 I'm running into a problem subsetting a data frame that I have never
 encountered before:

 dim(chkPd)
 [1] 3213    6

 df = head(chkPd)
 df
          PN     WB   Sire    Dam  MG SEX
 601    1001 715349  61710  61702  67   F
 969  1001_1 511092 616253 615037 168   F
 986  1002_1 511082 616253 623905 168   F
 667    1003 715617  61817  61441  67   F
 1361 1003_1 510711 635246 627321 168   F
 754    1004 715272  62356  61380  67   F


 dfb = chkPd[df$PN,]
 dfb
             PN     WB   Sire    Dam  MG SEX
 1001    2114_1 510944 616294 614865 168   M
 NA          NA     NA     NA     NA  NA  NA
 NA.1        NA     NA     NA     NA  NA  NA
 1003    1130_1 510950 616294 619694 168   F
 NA.2        NA     NA     NA     NA  NA  NA
 1004 2221-SHR2 510952 616294 619694 168   M


 I'm not sure why I'm getting this behaviour. By subsetting the
 original data frame by PN I seem to be pulling out row numbers,
 so I only get results where PN is no larger than the number of
 rows in the original data frame, and of course nothing where PN
 has _ in the id. I have also tried using subset but haven't had any
 luck with that either.

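That is the documented behavior as far as I can tell: a character
index in the row position is matched against the row names rather
than against the values of PN. See

?"[.data.frame"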

Maybe my brain is going soft at the end of a long day, but I can't
tell what you're trying to do. Can you clarify?

-Ista
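
A small sketch of why the NA rows appear. The data frame `chk`, its row names and its WB values are made up for illustration (they are not the real chkPd), and it assumes PN is stored as character, which the output above suggests:

```r
# Hypothetical stand-in for chkPd: row names differ from the PN values.
chk <- data.frame(
  PN = c("30", "10_1", "20"),
  WB = c(1, 2, 3),
  row.names = c("10", "20", "30"),
  stringsAsFactors = FALSE
)

# A character vector in the row position is looked up in rownames(chk).
# "30" and "20" happen to be row names, so those rows come back;
# "10_1" is not a row name, so that position becomes a row of NAs.
res <- chk[chk$PN, ]
print(res)
```

The rows returned are the ones whose row name equals a PN value, which in general are not the rows that have that PN.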



dfb = subset(chkPd, PN==df$PN)
 Warning message:
 In PN == df$PN :
  longer object length is not a multiple of shorter object length

 I wasn't aware that the length of the larger data frame had to be a
 multiple of the length of the object you were subsetting by. In any
 case I would appreciate any insight into what I may be doing wrong.
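
The warning comes from how `==` works rather than from a rule about subsetting. A minimal sketch with two made-up character vectors (`big` and `small` are illustrative names, not objects from the thread):

```r
big   <- c("a", "b", "c", "d", "e")
small <- c("b", "a")

# `==` compares position by position and recycles the shorter vector,
# so `small` is treated as b, a, b, a, b. R warns because 5 is not a
# multiple of 2; the warning is silenced here to keep the output clean.
eq <- suppressWarnings(big == small)

# `%in%` asks, for each element of `big`, whether it occurs anywhere
# in `small`, regardless of position.
inn <- big %in% small

print(eq)
print(inn)
```

Even when the lengths do divide evenly and no warning is shown, `==` is still a positional comparison, so it is not a membership test.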

 Cheers,

 Sean


 sessionInfo()
 R version 2.9.1 (2009-06-26)
 i386-apple-darwin8.11.1

 locale:
 en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

 attached base packages:
 [1] splines   stats     graphics  grDevices utils     datasets  methods   base

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org



Re: [R] help sub setting data frame

2009-10-22 Thread Sean MacEachern
Hi Ista,

I think I'm suffering long dayitis myself. You are probably right. I
don't use subset that often. I typically use brackets to subset
dataframes. Essentially what I am trying to do is take my original
dataframe (chkPd) and subset it using a smaller dataframe with some
matching PN IDs. They are only a few hundred rows different in size so
subset wouldn't be appropriate here. I'm just struggling to figure out
what's going wrong in my first example.
For instance, if I try:
 df = data.frame('id'=c(1,2,3,4),'res'=c(10,10,20,20))
 dfb = df[1:2,]
 dfc = df[dfb$id,]

I get something along the lines of what I'd expect where my new
dataframe is a subset of the original based on the matching ids I
specified in dfb$id. Is that wrong in my first example?
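
A short sketch of why the toy example seems to behave: the ids there equal the row positions, so indexing by id and indexing by row give the same answer. `df2` below is a made-up variant with the ids reversed:

```r
df <- data.frame(id = c(1, 2, 3, 4), res = c(10, 10, 20, 20))

cols <- df[1:2]     # one index, no comma: selects columns 1 and 2
rows <- df[1:2, ]   # index before the comma: selects rows 1 and 2

# Same idiom, but ids no longer match row positions.
df2 <- data.frame(id = c(4, 3, 2, 1), res = c(10, 10, 20, 20))
wanted_ids <- c(4, 3)

# Numeric values in the row position are row positions, so this
# returns rows 4 and 3, whose ids are 1 and 2, not 4 and 3.
picked <- df2[wanted_ids, ]
print(picked)
```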

Cheers,

Sean



Re: [R] help sub setting data frame

2009-10-22 Thread Ista Zahn
Is this what you want?

df = data.frame(id = 1:100, res = 1001:1100)
dfb = df[1:10, ]                # first ten rows, used as the lookup table
dfc = df[df$id %in% dfb$id, ]   # rows of df whose id appears in dfb

Still not sure, but that's my best guess. Going back to your original
data you can try

 dfb = chkPd[chkPd$PN %in% df$PN,]
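
A hedged sketch of what the `%in%` approach returns, using a made-up four-row stand-in for chkPd (`chk`, `wanted` and the MG values are illustrative). It also shows `match()` as one option if the rows should come back in the order of the lookup ids instead:

```r
chk <- data.frame(
  PN = c("1001", "1001_1", "1002_1", "1003"),
  MG = c(67, 168, 168, 67),
  stringsAsFactors = FALSE
)
wanted <- c("1003", "1001_1", "9999")   # "9999" is not in chk

# Logical mask: rows keep the order they have in chk,
# and ids with no match are simply dropped.
kept <- chk[chk$PN %in% wanted, ]

# match() gives the position in chk of each wanted id (NA if absent),
# so indexing by it returns rows in the order of `wanted`.
pos <- match(wanted, chk$PN)
in_lookup_order <- chk[pos[!is.na(pos)], ]

print(kept)
print(in_lookup_order)
```

Neither form produces NA rows, because the comparison is made against the PN column and not against the row names.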

Hope it helps,
Ista



Re: [R] help sub setting data frame

2009-10-22 Thread Sean MacEachern
Works perfectly!

Thanks to all who responded.

Sean
