On 9/19/07, Karin Lagesen <[EMAIL PROTECTED]> wrote:
>
> Sorry about this one being long, and I apologise beforehand if there
> is something obvious here that I have missed. I am new to creating my
> own functions in R, and I am uncertain of how they work.
>
> I have a data set that I have read into a data frame:
>
> > gctable[1:5,]
>      refseq geometry X60_origin X60_terminus  length  kingdom
> 1 NC_009484      cir    1790000       773000 3389227 Bacteria
> 2 NC_009484      cir    1790000       773000 3389227 Bacteria
> 3 NC_009484      cir    1790000       773000 3389227 Bacteria
> 4 NC_009484      cir    1790000       773000 3389227 Bacteria
> 5 NC_009484      cir    1790000       773000 3389227 Bacteria
>                   grp feature gene begin dir gc_content replicor LEADLAG
> 1 Alphaproteobacteria     CDS  CDS   261   +   0.654244    RIGHT    LEAD
> 2 Alphaproteobacteria     CDS  CDS  1737   -   0.651408    RIGHT     LAG
> 3 Alphaproteobacteria     CDS  CDS  2902   +   0.607843    RIGHT    LEAD
> 4 Alphaproteobacteria     CDS  CDS  3693   +   0.617647    RIGHT    LEAD
> 5 Alphaproteobacteria     CDS  CDS  4227   +   0.699208    RIGHT    LEAD
> >
>
> Most of these columns are factors.
>
> Now, I have a function that I would like to employ on this data
> frame. Right now I cannot get it to work, and that seems to be due to
> the columns in the data frame being factors. I tested it with a data
> frame created from vectors, and it worked fine.
>
> The function:
>
> percentdistance <- function(origin, terminus, length, begin, replicor){
>   print(c(origin, terminus, length, begin, repl))
>   d = 0
>   if (terminus>origin) {
>     if(replicor=="LEFT") {
>       d = -((origin-begin)%%length)
>     }
>     else {
>       d = (begin-origin)
>     }
>   }
>   else {
>     if (replicor=="LEFT") {
>       d=(origin-begin)
>     }
>     else{
>       d = -((begin-origin)%%length)
>     }
>   }
>   d/length*2
> }
>
> The error I get:
>
> > percentdistance(gctable$X60_origin, gctable$X60_terminus, gctable$length,
>     gctable$begin, gctable$replicor)
>   [1] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
>  [19] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
>  [37] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
>  [55] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
>  [73] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
>  [91] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
> [109] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
> [127] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
> .....
> [99919] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
> [99937] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
> [99955] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
> [99973] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
> [99991] 2 2 2 2 2 2 2 2 2
>  [ reached getOption("max.print") -- omitted 8526091 entries ]]
> Error in if (terminus > origin) { : missing value where TRUE/FALSE needed
> In addition: Warning messages:
> 1: > not meaningful for factors in: Ops.factor(terminus, origin)
> 2: the condition has length > 1 and only the first element will be used in:
>    if (terminus > origin) {
> >
>
> This worked nice when the input were columns from a data frame created
> from vectors.
>
> I have also tried the different apply-functions, although I am
> uncertain of which one would be appropriate here.
>
> ...
>
> Karin
> --
> Karin Lagesen, PhD student
> [EMAIL PROTECTED]
> http://folk.uio.no/karinlag

Hi Karin!

A couple of things:

First, the first warning message tells you:

  1: > not meaningful for factors in: Ops.factor(terminus, origin)

So terminus and origin are factors, and ">" is not defined for
(unordered) factors. You have to convert them to numeric variables first
(see the R FAQ for how to do this). Note that as.numeric() applied
directly to a factor returns the internal level codes, not the original
values, so convert via as.character() first.

Second, the second warning message tells you:

  2: the condition has length > 1 and only the first element will be
     used in: if (terminus > origin)

You are comparing two vectors, which produces a vector of TRUE/FALSE
values, but the "if" statement needs a single TRUE/FALSE value. Either
use a for loop:

  for (i in seq_along(terminus)) { if (terminus[i] > origin[i]) ... }

or a nested ifelse() statement (which is preferable on such a big data
set).

Best,

Gustaf

--
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address: Essingetorget 40, 112 66 Stockholm, SE
skype: gustaf_rydevik
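
For illustration, here is a minimal sketch of the two suggestions above (convert the factors to numbers, then replace the if/else with nested ifelse()). It assumes a small made-up data frame, toy, standing in for gctable, and the helper names factor_to_numeric and percentdistance_vec are hypothetical, not from the original post. The branch logic is copied from the posted percentdistance function; only the control flow is vectorised.

```r
# Hypothetical toy data standing in for gctable. The numeric columns are
# deliberately stored as factors to reproduce the situation in the post.
toy <- data.frame(
  X60_origin   = factor(c("1790000", "1790000", "773000", "773000")),
  X60_terminus = factor(c("773000", "773000", "1790000", "1790000")),
  length       = factor(rep("3389227", 4)),
  begin        = factor(c("261", "1737", "2902", "3693")),
  replicor     = factor(c("RIGHT", "LEFT", "RIGHT", "LEFT"))
)

# as.numeric() on a factor returns the internal level codes, so go through
# as.character() to recover the values that were actually read in.
factor_to_numeric <- function(f) as.numeric(as.character(f))

# Vectorised version of the posted function: each ifelse() picks, element
# by element, between the same expressions the original if/else chose from.
percentdistance_vec <- function(origin, terminus, length, begin, replicor) {
  replicor <- as.character(replicor)
  d <- ifelse(terminus > origin,
              ifelse(replicor == "LEFT",
                     -((origin - begin) %% length),
                     begin - origin),
              ifelse(replicor == "LEFT",
                     origin - begin,
                     -((begin - origin) %% length)))
  d / length * 2
}

# One result per row, with no explicit loop.
toy$pd <- percentdistance_vec(
  factor_to_numeric(toy$X60_origin),
  factor_to_numeric(toy$X60_terminus),
  factor_to_numeric(toy$length),
  factor_to_numeric(toy$begin),
  toy$replicor
)
```

Converting the columns once in the data frame (for example, gctable$begin <- factor_to_numeric(gctable$begin)) would avoid repeating the conversion at every call; the sketch converts at the call site only to keep the toy data unchanged.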