Dear Henrique,

Thanks a lot for the help! I got this:

> f <- function(x)
+ {
+ cbind.data.frame(chr=unique(x$chr),
+ Start=min(x$pos),
+ End=max(x$pos),
+ Rows=nrow(x),
+ Pattern=paste("(", x$s1, x$s2, ")")
+ )
+ }
>
> do.call("rbind", lapply(lapply(split(df, paste(df$s1, df$s2)), f), unique))
Error in split.default(df, paste(df$s1, df$s2)) :
  first argument must be a vector

Any clue?

Best,
Allen

On Dec 24, 2007 11:27 AM, Henrique Dallazuanna <[EMAIL PROTECTED]> wrote:

> Try this:
>
> f <- function(x)
> {
> cbind.data.frame(chr=unique(x$chr),
> Start=min(x$pos),
> End=max(x$pos),
> Rows=nrow(x),
> Pattern=paste("(", x$s1, x$s2, ")")
> )
> }
>
> do.call("rbind", lapply(lapply(split(df, paste(df$s1, df$s2)), f), unique))
>
>
> On 24/12/2007, affy snp <[EMAIL PROTECTED]> wrote:
> > Thanks Moshe! I apologize for not being so clear about the
> > second part. Again, below is what the data looks like. The
> > patterns for columns s1 and s2 are:
> >
> > (-1 -1) (-1 0) (-1 1) (0 -1) (0 0) (0 1) (1 -1) (1 0) (1 1)
> >     104    131     57    631   305   668     33    15   107
> >
> > There are 9 patterns, in other words the 9 combinations of -1, 0
> > and 1 given in the parentheses. The number of occurrences of each
> > is underneath. What I wish to do is this: scan the data from the
> > beginning, and if any consecutive rows have the same pattern (one
> > of the 9 combinations above), 'memorize' the following information:
> >
> > the number in the 'chr' column, the number in the 'pos' column for
> > the first row of the consecutive rows, the number in the 'pos'
> > column for the last row of the consecutive rows, how many
> > consecutive rows there are, and the corresponding pattern.
> >
> > I forgot to stress one requirement in the definition of
> > consecutive rows: they must be adjacent in the data and have the
> > same value of 'chr'.
> >
> > Just to illustrate this, here is an example based on the data:
> >
> > BAC          chr  pos        s1  s2
> > RP11-80G24   1    77465510    0   0
> > RP11-198H14  1    78696291   -1   0
> > RP11-267M21  1    79681704   -1   0
> > RP11-89A19   1    80950808   -1   0
> > RP11-6B16    1    82255496   -1   0
> > RP11-210E16  2    228801510  -1   0
> >
> > Even though rows 2 to 6 have the same pattern, which is -1 0,
> > and are consecutive, row 6 has a different 'chr' value from the
> > other rows. Therefore, we will not count row 6 and end up with:
> >
> > chr  Start     End       #of_rows  pattern
> > 1    78696291  82255496  4         (-1 0)
> >
> > Hope this is clear. Thank you once again and Merry X'mas!
> >
> > Best,
> > Allen
> >
> >
> > > BAC          chr  pos        s1  s2
> > > RP11-80G24   1    77465510   -1   0
> > > RP11-198H14  1    78696291   -1   0
> > > RP11-267M21  1    79681704   -1   0
> > > RP11-89A19   1    80950808   -1   0
> > > RP11-6B16    1    82255496   -1   0
> > > RP11-210E16  1    228801510   0  -1
> > > RP11-155C15  1    230957584   0  -1
> > > RP11-210F8   1    237932418   0  -1
> > > RP11-263L17  2    65724492    0   1
> > > RP11-340F16  2    65879898    0   1
> > > RP11-68A1    2    67718674    0   0
> > > RP11-474G23  2    68318411    0   0
> > > RP11-218N6   2    68454651    0   0
> > > CTD-2003M22  2    68567494    0   0
> > > .....
> > >
> >
> > On Dec 24, 2007 3:54 AM, Moshe Olshansky <[EMAIL PROTECTED]> wrote:
> >
> > > To answer your first question, try
> > >
> > > M[-which( M$s1 == 0 & M$s2 == 0),]
> > >
> > > For the second question, you must start with a more
> > > precise definition of the grouping criterion.
> > >
> > > --- affy snp <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hello list,
> > > >
> > > > I have a data frame M like:
> > > >
> > > > BAC          chr  pos        s1  s2
> > > > RP11-80G24   1    77465510   -1   0
> > > > RP11-198H14  1    78696291   -1   0
> > > > RP11-267M21  1    79681704   -1   0
> > > > RP11-89A19   1    80950808   -1   0
> > > > RP11-6B16    1    82255496   -1   0
> > > > RP11-210E16  1    228801510   0  -1
> > > > RP11-155C15  1    230957584   0  -1
> > > > RP11-210F8   1    237932418   0  -1
> > > > RP11-263L17  2    65724492    0   1
> > > > RP11-340F16  2    65879898    0   1
> > > > RP11-68A1    2    67718674    0   0
> > > > RP11-474G23  2    68318411    0   0
> > > > RP11-218N6   2    68454651    0   0
> > > > CTD-2003M22  2    68567494    0   0
> > > > .....
> > > >
> > > > How can I remove those rows which have 0 in both of
> > > > columns s1 and s2?
> > > > Something like M[!M$21=0&!M$s2=0]?
> > > >
> > > > Moreover, I want to get a list of the subsets of rows
> > > > which have the same pattern of data. For example, the
> > > > first 8 rows in M can be clustered into 2 groups
> > > > (represented below in 2 rows) and shown as:
> > > >
> > > > chr  Start      End        # of rows  Pattern
> > > > 1    77465510   82255496   5          (-1 0)
> > > > 1    228801510  237932418  3          (0 -1)
> > > >
> > > > Can anybody help me out with this? Thank you very much
> > > > and happy holidays!
> > > >
> > > > Best,
> > > > Allen
> > > >
> > > > [[alternative HTML version deleted]]
> > > >
> > > > ______________________________________________
> > > > R-help@r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide
> > > > http://www.R-project.org/posting-guide.html
> > > > and provide commented, minimal, self-contained,
> > > > reproducible code.
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
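
A note on the error reported at the top of the thread: `split.default` complaining that its first argument is not a vector suggests that `df` was not a data frame in that session. One plausible cause, assuming no object named `df` had been created (the data frame in the original post is called `M`), is that R picked up the built-in `stats::df`, the F-distribution density function. The sketch below rebuilds the first eight posted rows as `M` and reruns the suggested code on `M`; it is an illustration under that assumption, not a confirmed diagnosis.

```r
# Assumption: no data frame named `df` exists, so `df` resolves to
# stats::df, which is a function and cannot be split.
is.function(stats::df)

# First eight rows of the posted data, under the name used in the post.
M <- data.frame(
  BAC = c("RP11-80G24", "RP11-198H14", "RP11-267M21", "RP11-89A19",
          "RP11-6B16", "RP11-210E16", "RP11-155C15", "RP11-210F8"),
  chr = c(1, 1, 1, 1, 1, 1, 1, 1),
  pos = c(77465510, 78696291, 79681704, 80950808, 82255496,
          228801510, 230957584, 237932418),
  s1  = c(-1, -1, -1, -1, -1, 0, 0, 0),
  s2  = c(0, 0, 0, 0, 0, -1, -1, -1)
)

# The function suggested in the thread, unchanged.
f <- function(x)
{
  cbind.data.frame(chr = unique(x$chr),
                   Start = min(x$pos),
                   End = max(x$pos),
                   Rows = nrow(x),
                   Pattern = paste("(", x$s1, x$s2, ")"))
}

# Same call as in the thread, but splitting M instead of df.
res <- do.call("rbind",
               lapply(lapply(split(M, paste(M$s1, M$s2)), f), unique))
res
```

Note that `split()` groups every row sharing a pattern, whether or not the rows are adjacent and whether or not they share a `chr` value, so this only matches the stated requirement when each pattern occurs in a single run.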
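
The grouping rule described in the thread (runs of adjacent rows that share both `chr` and the (s1 s2) pattern) can be sketched with base R's `rle()`. This is one possible approach rather than the one proposed in the thread; the helper name `summarise_runs` and the pasted key are illustrative assumptions. The data are the six example rows from Allen's follow-up.

```r
# The six example rows from the follow-up message.
M <- data.frame(
  BAC = c("RP11-80G24", "RP11-198H14", "RP11-267M21", "RP11-89A19",
          "RP11-6B16", "RP11-210E16"),
  chr = c(1, 1, 1, 1, 1, 2),
  pos = c(77465510, 78696291, 79681704, 80950808, 82255496, 228801510),
  s1  = c(0, -1, -1, -1, -1, -1),
  s2  = c(0, 0, 0, 0, 0, 0)
)

# Hypothetical helper: one output row per run of adjacent rows that
# share the same chr, s1 and s2.
summarise_runs <- function(x) {
  key   <- paste(x$chr, x$s1, x$s2)   # changes when chr or the pattern changes
  runs  <- rle(key)                   # lengths of runs of identical keys
  last  <- cumsum(runs$lengths)       # row index of the last row of each run
  first <- last - runs$lengths + 1    # row index of the first row of each run
  data.frame(
    chr     = x$chr[first],
    Start   = x$pos[first],
    End     = x$pos[last],
    Rows    = runs$lengths,
    Pattern = paste0("(", x$s1[first], " ", x$s2[first], ")")
  )
}

res <- summarise_runs(M)
res
```

On this input the helper returns three runs, including the four-row (-1 0) run on chromosome 1 from 78696291 to 82255496 given in the example. Single-row runs are kept; if they are not wanted, `res[res$Rows > 1, ]` would drop them.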