Re: [R] function on factors - how best to proceed
On 9/19/07, Karin Lagesen <[EMAIL PROTECTED]> wrote:
>
> Sorry about this one being long, and I apologise beforehand if there
> is something obvious here that I have missed. I am new to creating my
> own functions in R, and I am uncertain of how they work.
>
> I have a data set that I have read into a data frame:
>
> > gctable[1:5,]
>      refseq geometry X60_origin X60_terminus  length  kingdom
> 1 NC_009484      cir        179       773000 3389227 Bacteria
> 2 NC_009484      cir        179       773000 3389227 Bacteria
> 3 NC_009484      cir        179       773000 3389227 Bacteria
> 4 NC_009484      cir        179       773000 3389227 Bacteria
> 5 NC_009484      cir        179       773000 3389227 Bacteria
>                   grp feature gene begin dir gc_content replicor LEADLAG
> 1 Alphaproteobacteria     CDS  CDS   261   +   0.654244    RIGHT    LEAD
> 2 Alphaproteobacteria     CDS  CDS  1737   -   0.651408    RIGHT     LAG
> 3 Alphaproteobacteria     CDS  CDS  2902   +   0.607843    RIGHT    LEAD
> 4 Alphaproteobacteria     CDS  CDS  3693   +   0.617647    RIGHT    LEAD
> 5 Alphaproteobacteria     CDS  CDS  4227   +   0.699208    RIGHT    LEAD
> >
>
> Most of these columns are factors.
>
> Now, I have a function that I would like to employ on this data
> frame. Right now I cannot get it to work, and that seems to be due to
> the columns in the data frame being factors. I tested it with a data
> frame created from vectors, and it worked fine.
>
> The function:
>
> percentdistance <- function(origin, terminus, length, begin, replicor){
>   print(c(origin, terminus, length, begin, repl))
>   d = 0
>   if (terminus>origin) {
>     if(replicor=="LEFT") {
>       d = -((origin-begin)%%length)
>     }
>     else {
>       d = (begin-origin)
>     }
>   }
>   else {
>     if (replicor=="LEFT") {
>       d=(origin-begin)
>     }
>     else{
>       d = -((begin-origin)%%length)
>     }
>   }
>   d/length*2
> }
>
> The error I get:
>
> > percentdistance(gctable$X60_origin, gctable$X60_terminus, gctable$length,
> >                 gctable$begin, gctable$replicor)
>     [1] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
>    [19] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
>    [37] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
>    [55] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
>    [73] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
>    [91] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
>   [109] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
>   [127] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
> .
> [99919] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
> [99937] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
> [99955] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
> [99973] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
> [1] 2 2 2 2 2 2 2 2 2
> [ reached getOption("max.print") -- omitted 8526091 entries ]]
> Error in if (terminus > origin) { : missing value where TRUE/FALSE needed
> In addition: Warning messages:
> 1: > not meaningful for factors in: Ops.factor(terminus, origin)
> 2: the condition has length > 1 and only the first element will be used in:
>    if (terminus > origin) {
> >
>
> This worked nice when the input were columns from a data frame created
> from vectors.
>
> I have also tried the different apply-functions, although I am
> uncertain of which one would be appropriate here.
>
> ...
>
> Karin
> --
> Karin Lagesen, PhD student
> [EMAIL PROTECTED]
> http://folk.uio.no/karinlag

Hej Karin!
A couple of things:

First, the first warning message tells you that:
1: > not meaningful for factors in: Ops.factor(terminus, origin)

Thus, terminus and origin are factors, and ">" is not defined for unordered factors. You have to convert them to numeric variables (see the R FAQ for how to do this).

The second warning message tells you that:
2: the condition has length > 1 and only the first element will be used in: if (terminus > origin)

You are comparing two vectors, which generates a vector of TRUE/FALSE values. The "if" statement needs a single TRUE/FALSE value.
Either use a for loop:
for (i in seq_along(terminus)) { if (terminus[i] > origin[i]) ... }
or a nested ifelse statement (which is preferable on such a big data set).

best,

Gustaf

--
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
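A minimal sketch of the factor-to-numeric conversion mentioned in the reply, using a made-up factor rather than the real gctable columns (the name origin_f and its values are assumptions for illustration). Calling as.numeric() directly on a factor returns the internal level codes, which is probably where the repeated 87s and 2s in the printed output come from; going through as.character() recovers the original numbers.

```r
# Hypothetical factor standing in for a column such as gctable$X60_origin
origin_f <- factor(c("179", "773000", "179", "3389227"))

# as.numeric() on a factor returns the internal level codes, not the values
codes <- as.numeric(origin_f)

# Converting via character recovers the numbers stored in the labels
values <- as.numeric(as.character(origin_f))

# Equivalent alternative: convert the levels once, then index by the factor
values_alt <- as.numeric(levels(origin_f))[origin_f]

print(codes)
print(values)
```

If the converted column contains NA values, some entries were not numeric to begin with, which would also explain why the column was read in as a factor.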
Re: [R] function on factors - how best to proceed
On 9/19/07, Gustaf Rydevik <[EMAIL PROTECTED]> wrote:
> On 9/19/07, Karin Lagesen <[EMAIL PROTECTED]> wrote:
> > "Gustaf Rydevik" <[EMAIL PROTECTED]> writes:
> >
> > > The second warning message tells you that:
> > > 2: the condition has length > 1 and only the first element will be
> > > used in: if (terminus > origin)
> > >
> > > You are comparing two vectors, which generates a vector of TRUE/FALSE
> > > values.
> > > The "if" statement needs a single TRUE/FALSE value.
> > > Either use a for loop:
> > > for (i in seq_along(terminus)) { if (terminus[i] > origin[i]) ... }
> > > or a nested ifelse statement (which is preferable on such a big data
> > > set).
> >
> > Thank you for your reply! I will certainly try the numeric thing.
> >
> > Now, for the vector comparison. I can easily see how you would do a
> > for loop here, but I am unable to see how a nested ifelse statement
> > would work. Could you possibly give me an example?
> >
> > Thank you again for your help!
> >
> > Karin
>
> You replace each instance of "if" with ifelse, inserting a comma after
> the logical test and another in place of the else statement. The end
> result would become (if I've not made a mistake):
>
> _
> replicator<-rep(c("LEFT","RIGHT"),50)
> terminus<-rnorm(100)
> origin<-rnorm(100)
> begin<-rnorm(100)
> length<-sample(1:100,100,replace=T)
>
> d<-ifelse(terminus>origin,
> +ifelse(replicator=="LEFT",-((origin-begin))%%length),(begin-origin)),
> +ifelse(replicator=="LEFT",(origin-begin),-((begin-origin)%%length))
> +)
>
> /Gustaf

Sorry, forgot to remove the plusses, and had a parenthesis wrong...
__
replicator<-rep(c("LEFT","RIGHT"),50)
terminus<-rnorm(100)
origin<-rnorm(100)
begin<-rnorm(100)
length<-sample(1:100,100,replace=TRUE)

d<-ifelse(terminus>origin,
          ifelse(replicator=="LEFT",-((origin-begin)%%length),(begin-origin)),
          ifelse(replicator=="LEFT",(origin-begin),-((begin-origin)%%length))
)
___

best,
Gustaf
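The corrected expression can be wrapped back into a function with the same arguments as the original percentdistance, including the final d/length*2 scaling that the example leaves out. This is only a sketch: the name percentdistance_vec and the small data frame toy are assumptions made for illustration, and the numeric columns are assumed to have been converted from factors beforehand.

```r
# Vectorised sketch of percentdistance(); the function name is hypothetical
percentdistance_vec <- function(origin, terminus, length, begin, replicor) {
  left <- replicor == "LEFT"
  d <- ifelse(terminus > origin,
              ifelse(left, -((origin - begin) %% length), begin - origin),
              ifelse(left, origin - begin, -((begin - origin) %% length)))
  d / length * 2
}

# Made-up rows, one for each of the four branches
toy <- data.frame(
  origin   = c(100, 100, 900, 900),
  terminus = c(600, 600, 400, 400),
  length   = c(1000, 1000, 1000, 1000),
  begin    = c(50, 250, 950, 850),
  replicor = c("LEFT", "RIGHT", "LEFT", "RIGHT"),
  stringsAsFactors = FALSE
)

# One call handles every row; no loop or apply function is needed
toy$pd <- with(toy, percentdistance_vec(origin, terminus, length, begin, replicor))
print(toy)
```

On the real data the call would take the corresponding gctable columns in place of the toy ones.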
Re: [R] function on factors - how best to proceed
On 9/19/07, Karin Lagesen <[EMAIL PROTECTED]> wrote:
> "Gustaf Rydevik" <[EMAIL PROTECTED]> writes:
>
> > The second warning message tells you that:
> > 2: the condition has length > 1 and only the first element will be
> > used in: if (terminus > origin)
> >
> > You are comparing two vectors, which generates a vector of TRUE/FALSE
> > values.
> > The "if" statement needs a single TRUE/FALSE value.
> > Either use a for loop:
> > for (i in seq_along(terminus)) { if (terminus[i] > origin[i]) ... }
> > or a nested ifelse statement (which is preferable on such a big data
> > set).
>
> Thank you for your reply! I will certainly try the numeric thing.
>
> Now, for the vector comparison. I can easily see how you would do a
> for loop here, but I am unable to see how a nested ifelse statement
> would work. Could you possibly give me an example?
>
> Thank you again for your help!
>
> Karin

You replace each instance of "if" with ifelse, inserting a comma after the logical test and another in place of the else statement. The end result would become (if I've not made a mistake):

_
replicator<-rep(c("LEFT","RIGHT"),50)
terminus<-rnorm(100)
origin<-rnorm(100)
begin<-rnorm(100)
length<-sample(1:100,100,replace=T)

d<-ifelse(terminus>origin,
+ifelse(replicator=="LEFT",-((origin-begin))%%length),(begin-origin)),
+ifelse(replicator=="LEFT",(origin-begin),-((begin-origin)%%length))
+)

/Gustaf
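The translation rule is easiest to see on a single-level case. The sketch below uses two made-up vectors, a and b (not from the thread), and checks that ifelse gives the same result as a loop that applies a scalar if/else to one element at a time.

```r
a <- c(5, 2, 9, 4)
b <- c(3, 7, 1, 4)

# Scalar if/else, applied element by element in a loop
d_loop <- numeric(length(a))
for (i in seq_along(a)) {
  if (a[i] > b[i]) {
    d_loop[i] <- a[i] - b[i]
  } else {
    d_loop[i] <- 0
  }
}

# Same logic: "if" becomes ifelse, with a comma after the test
# and a second comma where "else" used to be
d_vec <- ifelse(a > b, a - b, 0)

print(d_loop)
print(d_vec)
```

Note that ifelse generally evaluates both branch expressions over the whole vector and then picks a value for each element, so both expressions must be computable for every row.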