See inline below.

Laetitia Schmid wrote:
Dear Steve,
my solution looks like it would work, but it does not.
I attached a text file with an extract of my data. Maybe you can try it yourself. I want to compare C1 with M1, C2 with M2, C3 with M3, and so on, for each column.
I do not really know what the problem is. R complains about a syntax error.
The function I am applying counts the common strings between the two. Greg Hirson helped me to write it.

lettermatch <- function(a, b) {
  tb <- merge(as.data.frame(table(strsplit(a, ""))), as.data.frame(table(strsplit(b, ""))), by="Var1")
  sum(apply(tb[-1], 1, min))
}

For example for the second column I tried:

for (x in 1:(nrow(dat)-1)) {
a <- as.character(dat[(2x-1),1])

Shouldn't that be 2*x-1??

 -Peter Ehlers

b <- as.character(dat[(2x),1])
 lettermatch(a,b)
}
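
Following Peter's hint, here is a minimal sketch of the loop with the multiplication written out. R has no implicit multiplication, so `2x` is a syntax error and has to be `2*x`. Two further changes are assumptions that go beyond what the thread states: the loop runs over `nrow(dat) %/% 2` pairs (the bound `1:(nrow(dat)-1)` would index past the last row), and the results are stored in a vector, since a `for` loop neither prints nor keeps the value of its last expression. The small `dat` is copied from the first six rows of the posted extract; `n_pairs` and `loop_results` are illustrative names.

```r
# lettermatch() as posted in the thread
lettermatch <- function(a, b) {
  tb <- merge(as.data.frame(table(strsplit(a, ""))),
              as.data.frame(table(strsplit(b, ""))), by = "Var1")
  sum(apply(tb[-1], 1, min))
}

# First six rows of the posted extract (two sequence columns only)
dat <- data.frame(
  Individuals = c("C1", "M1", "C2", "M2", "C3", "M3"),
  Seq1 = c("GGGG", "GGGG", "GGGG", "AGGG", "AGGG", "AGGG"),
  Seq2 = c("AATT", "AAAA", "AATT", "AACT", "AACT", "AACT"),
  stringsAsFactors = FALSE
)

n_pairs <- nrow(dat) %/% 2            # one result per C/M pair
loop_results <- numeric(n_pairs)      # storage for the counts

for (x in seq_len(n_pairs)) {
  a <- dat[2 * x - 1, "Seq2"]         # odd row: C individual
  b <- dat[2 * x, "Seq2"]             # even row: M individual
  loop_results[x] <- lettermatch(a, b)
}

print(loop_results)
```

For these three pairs of Seq2 the counts should come out as 2, 3 and 4.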

or

 a <- as.character(dat[seq(1, nrow(dat), by=2),2])
 b <- as.character(dat[seq(2, nrow(dat), by=2), 2])
 all.results <- lettermatch(a,b)
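
A caveat on this second variant: as far as I can tell, lettermatch() as written expects two single strings, because strsplit() returns one list element per input string. Passing whole vectors therefore will not give one count per pair. A hedged sketch that keeps the odd/even split but calls the function pair by pair with mapply() could look like this; `seq2` and `pair_counts` are illustrative names, and the strings are the first eight entries of the Seq2 column in the posted extract.

```r
# lettermatch() as posted in the thread
lettermatch <- function(a, b) {
  tb <- merge(as.data.frame(table(strsplit(a, ""))),
              as.data.frame(table(strsplit(b, ""))), by = "Var1")
  sum(apply(tb[-1], 1, min))
}

# First eight entries of Seq2 (C1, M1, C2, M2, C3, M3, C4, M4)
seq2 <- c("AATT", "AAAA", "AATT", "AACT", "AACT", "AACT", "AATT", "AAAT")

a <- seq2[seq(1, length(seq2), by = 2)]   # C individuals
b <- seq2[seq(2, length(seq2), by = 2)]   # M individuals

# mapply() calls lettermatch(a[i], b[i]) for each i
pair_counts <- mapply(lettermatch, a, b, USE.NAMES = FALSE)
print(pair_counts)
```

For these four pairs the counts should be 2, 3, 4 and 3.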

If I read the data with dat <- read.delim("data_lgs.txt", stringsAsFactors=FALSE), I can drop the as.character() calls in the code above.

Laetitia

Individuals    Seq1    Seq2    Seq3    Seq4
C1    GGGG    AATT    CCGG    CTTT
M1    GGGG    AAAA    GGGG    GGGG
C2    GGGG    AATT    CCGG    CTTT
M2    AGGG    AACT    CCGG    CGTT
C3    AGGG    AACT    CCGG    CGTT
M3    AGGG    AACT    CCGG    CGTT
C4    GGGG    AATT    CCGG    CCTT
M4    GGGG    AAAT    CGGG    CTTT
C5    AGGG    ACTT    CCCG    CTTT
M5    AGGG    CTTT    CCCC    CCTT
C6    AGGG    CTTT    CCCC    CCTT
M6    AAAG    CCTT    CCCC    CTTT
C7    AAAG    ACCC    CCCG    GTTT
M7    AAGG    AACC    CCGG    TTTT
C8    GGGG    AATT    CCGG    CCTT
M8    GGGG    AATT    CCGG    CCTT
C9    GGGG    AAAA    GGGG    TTTT
M9    GGGG    AAAA    GGGG    TTTT
C11    AGGG    AAAC    CGGG    GGTT
M11    GGGG    AATT    CCGG    CCTT
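
To run the comparison down every sequence column of a table like this one, a possible sketch is to wrap the pairwise call in a small helper and apply it to each column with sapply(). This is only one way to do it, not something proposed in the thread; `compare_column`, `odd`, `even` and `all_columns` are hypothetical names, and only the first four rows of the extract are used to keep the example short.

```r
# lettermatch() as posted in the thread
lettermatch <- function(a, b) {
  tb <- merge(as.data.frame(table(strsplit(a, ""))),
              as.data.frame(table(strsplit(b, ""))), by = "Var1")
  sum(apply(tb[-1], 1, min))
}

# First four rows of the posted extract, read from inline text
txt <- "Individuals Seq1 Seq2 Seq3 Seq4
C1 GGGG AATT CCGG CTTT
M1 GGGG AAAA GGGG GGGG
C2 GGGG AATT CCGG CTTT
M2 AGGG AACT CCGG CGTT"
dat <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

odd  <- seq(1, nrow(dat), by = 2)   # rows holding the C individuals
even <- seq(2, nrow(dat), by = 2)   # rows holding the M individuals

# Compare each C entry with the M entry below it, for one column
compare_column <- function(column) {
  mapply(lettermatch, column[odd], column[even], USE.NAMES = FALSE)
}

# One column of counts per sequence column; drop the Individuals column
all_columns <- sapply(dat[-1], compare_column)
rownames(all_columns) <- paste(dat$Individuals[odd], dat$Individuals[even], sep = "/")
print(all_columns)
```

The result is a matrix with one row per C/M pair and one column per sequence column.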



On 11.01.2010 at 15:18, Steve Lianoglou wrote:

Hi,

On Mon, Jan 11, 2010 at 8:41 AM, Laetitia Schmid <laeti...@gmt.su.se> wrote:
Hello World,
I have a function that makes pairwise comparisons between two strings. I would like to apply this function to my data (which consists of columns with different strings) in the way that it compares the first with the second entry, and then the third with the fourth, and then the fifth with the sixth, and so on down each column...
So (2x-1) and (2x) would be the different entries to be compared!

dat= my data:

for the first column: compare dat[(2x-1),1] with dat[(2x),1] and x would be 1:i, i=length(dat[,1])

I think the best way to do that is a loop:

a <- as.character(dat[(2x-1),1])
b <- as.character(dat[(2x),1])

for (i in 1:length(dat[,1]) my_function(a, b))

Can somebody help me to apply a function with a loop in the way I want to a column?

It seems as if you've got it already, haven't you?

for (x in 1:(nrow(dat)-1)) {
 a <- dat[(2x-1),1]
 b <- dat[(2x), 1]
 my_function(a,b)
}

Is there a specification of "tapply" for that?

I don't think so, but depending on what you want to do, the size of
your data, and the amount of RAM you have, it might be faster to
compare everything "at once" (assuming `my_function` can be
vectorized), for instance:

a <- dat[seq(1, nrow(dat), by=2),1]
b <- dat[seq(2, nrow(dat), by=2), 1]
all.results <- my_function(a,b)

Also, as an aside, I see you keep calling "as.character" on your data
when you extract it from your data.frame. Is your data being converted
to factors? If so, and you are reading in the data with
read.table/delim/etc, you can set stringsAsFactors=FALSE (see:
?read.table)
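
A small sketch of the difference this argument makes, using two rows of the data as inline text instead of a file (the `text=` argument of read.table stands in for the real file name here; `dat_chr` and `dat_fac` are illustrative names). Note that since R 4.0.0 the default is already stringsAsFactors=FALSE, so the conversion to factors mainly matters on older versions.

```r
txt <- "Individuals Seq1 Seq2
C1 GGGG AATT
M1 GGGG AAAA"

# Strings kept as character vectors
dat_chr <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

# Strings converted to factors (the default before R 4.0.0)
dat_fac <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)

print(sapply(dat_chr, class))
print(sapply(dat_fac, class))

# With factors, as.character() is needed to get the strings back
print(as.character(dat_fac$Seq2))
```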

Hope that helps,

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Peter Ehlers
University of Calgary
403.202.3921
