I do not understand the value of using the rle function in your description, but the code below appears to produce the table you want.

Note that better support for the data.table package might be found at stackexchange as the documentation specifies.

x <- read.table( text=
"Dad Mum Child Group
AA RR RA A
AA RR RR A
AA AA AA B
AA AA AA B
RA AA RR B
RR AA RR B
AA AA AA B
AA AA RA C
AA AA RA C
AA RR RA C
", header=TRUE, stringsAsFactors=FALSE )

library(data.table)
DT <- data.table( x )
DT[ , cdad := as.integer( Dad %in% c( "AA", "RR" ) ) ]
DT[ , sumdad := 0L ]
DT[ 1==DT$cdad, sumdad := sum( cdad ), by=Group ]
DT[ , cdad := NULL ]
DT[ , cmum := as.integer( Mum %in% c( "AA", "RR" ) ) ]
DT[ , summum := 0L ]
DT[ 1==DT$cmum, summum := sum( cmum ), by=Group ]
DT[ , cmum := NULL ]
DT[ , cchild := as.integer( Child %in% c( "AA", "RR" ) ) ]
DT[ , sumchild := 0L ]
DT[ 1==DT$cchild, sumchild := sum( cchild ), by=Group ]
DT[ , cchild := NULL ]

DT
    Dad Mum Child Group sumdad summum sumchild
 1:  AA  RR    RA     A      2      2        0
 2:  AA  RR    RR     A      2      2        1
 3:  AA  AA    AA     B      4      5        5
 4:  AA  AA    AA     B      4      5        5
 5:  RA  AA    RR     B      0      5        5
 6:  RR  AA    RR     B      4      5        5
 7:  AA  AA    AA     B      4      5        5
 8:  AA  AA    RA     C      3      3        0
 9:  AA  AA    RA     C      3      3        0
10:  AA  RR    RA     C      3      3        0
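
The three near-identical blocks above can probably be collapsed into one call. The following is only a sketch under assumptions, not a tested replacement for your real data: the names `hom`, `cols` and `count_hom` are made up for illustration, and it relies on data.table's `.SD`/`.SDcols` with `:=` and `by`.

```r
library(data.table)

x <- read.table(text =
"Dad Mum Child Group
AA RR RA A
AA RR RR A
AA AA AA B
AA AA AA B
RA AA RR B
RR AA RR B
AA AA AA B
AA AA RA C
AA AA RA C
AA RR RA C
", header = TRUE, stringsAsFactors = FALSE)

DT <- as.data.table(x)

hom  <- c("AA", "RR")            # genotypes to count (hypothetical name)
cols <- c("Dad", "Mum", "Child") # columns to process (hypothetical name)

# For one column within one group: rows that match get the group's
# total number of matches, all other rows get 0.
count_hom <- function(v) {
  h <- v %in% hom
  as.integer(h) * sum(h)
}

# Adds sumdad, summum and sumchild in a single grouped assignment.
DT[, paste0("sum", tolower(cols)) := lapply(.SD, count_hom),
   by = Group, .SDcols = cols]

print(DT)
```

For the ten example rows this should give the same sumdad, summum and sumchild columns as the table printed above.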

On Tue, 30 Dec 2014, Kate Ignatius wrote:

I'm trying to use both these packages and wondering whether that is possible...

To make this simple, my ultimate goal is to determine long stretches of
1s, but I want to do this within groups (hence using data.table, as
I use the "set key" option). However, I'm not having much luck
making this possible.

For example, for simplicity's sake, I have the following data:

Dad Mum Child Group
AA RR RA A
AA RR RR A
AA AA AA B
AA AA AA B
RA AA RR B
RR AA RR B
AA AA AA B
AA AA RA C
AA AA RA C
AA RR RA C

And the following code which I know works

hetdad <- as.numeric(x[c(1)]=="AA" | x[c(1)]=="RR")
sumdad <- rle(hetdad)$lengths[rle(hetdad)$values==1]

hetmum <- as.numeric(x[c(2)]=="AA" | x[c(2)]=="RR")
summum <- rle(hetmum)$lengths[rle(hetmum)$values==1]

hetchild <- as.numeric(x[c(3)]=="AA" | x[c(3)]=="RR")
sumchild <- rle(hetchild)$lengths[rle(hetchild)$values==1]

However, I wish to run the above code by Group (this file is
millions of rows long and the groups will be larger, but I just wanted to
simplify the example).

I did something like this but of course I got an error:

LOH[,hetdad:=as.numeric(x[c(1)]=="AA" | x[c(1)]=="RR")]
LOH[,sumdad:=rle(hetdad)$lengths[rle(hetdad)$values==1],by=Group]
LOH[,hetmum:=as.numeric(x[c(2)]=="AA" | x[c(2)]=="RR")]
LOH[,summum:=rle(hetmum)$lengths[rle(hetmum)$values==1],by=Group]
LOH[,hetchild:=as.numeric(x[c(3)]=="AA" | x[c(3)]=="RR")]
LOH[,sumchild:=rle(hetchild)$lengths[rle(hetchild)$values==1],by=Group]
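
One plausible cause of the error in the `by=Group` lines above is a length mismatch: `rle(...)$lengths[...]` returns one value per run, while `:=` needs one value per row of the group (or a single value). A minimal sketch of expanding run lengths back to row length with `rep()` follows. It assumes the data.table package; the helper `run_len` and the column `rundad` are hypothetical names used only for illustration.

```r
library(data.table)

DT <- as.data.table(read.table(text =
"Dad Mum Child Group
AA RR RA A
AA RR RR A
AA AA AA B
AA AA AA B
RA AA RR B
RR AA RR B
AA AA AA B
AA AA RA C
AA AA RA C
AA RR RA C
", header = TRUE, stringsAsFactors = FALSE))

# For a logical vector h, return for every element the length of the
# run of TRUEs it belongs to, or 0 where h is FALSE.
run_len <- function(h) {
  r <- rle(h)
  rep(ifelse(r$values, r$lengths, 0L), r$lengths)
}

# One value per row, computed separately within each Group.
DT[, rundad := run_len(Dad %in% c("AA", "RR")), by = Group]

print(DT$rundad)
# 2 2 2 2 0 2 2 3 3 3
```

Note that this gives run lengths, so the Dad rows of Group B get 2 rather than the 4 shown in the desired table below; 4 is the total count of AA/RR rows in that group, not the length of a consecutive stretch.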

The reason is that I eventually want to have something like this:

Dad Mum Child Group sumdad summum sumchild
AA RR RA A 2 2 0
AA RR RR A 2 2 1
AA AA AA B 4 5 5
AA AA AA B 4 5 5
RA AA RR B 0 5 5
RR AA RR B 4 5 5
AA AA AA B 4 5 5
AA AA RA C 3 3 0
AA AA RA C 3 3 0
AA RR RA C 3 3 0

That is, I would like to have the specific counts next to what I'm
consecutively counting per group.  So for Group A for dad there are 2
AAs,  there are two RRs for mum but only 1 AA or RR for the child and
that is RR (so the 1 is next to the RR and not the RA).

Can this be done?

K.

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
