On 9/19/07, Karin Lagesen [EMAIL PROTECTED] wrote:
Sorry about this one being long, and I apologise beforehand if there
is something obvious here that I have missed. I am new to creating my
own functions in R, and I am uncertain of how they work.
I have a data set that I have read into a data frame:
> gctable[1:5,]
     refseq geometry X60_origin X60_terminus  length  kingdom
1 NC_009484      cir        179       773000 3389227 Bacteria
2 NC_009484      cir        179       773000 3389227 Bacteria
3 NC_009484      cir        179       773000 3389227 Bacteria
4 NC_009484      cir        179       773000 3389227 Bacteria
5 NC_009484      cir        179       773000 3389227 Bacteria
                  grp feature gene begin dir gc_content replicor LEADLAG
1 Alphaproteobacteria     CDS  CDS   261   +   0.654244    RIGHT    LEAD
2 Alphaproteobacteria     CDS  CDS  1737   -   0.651408    RIGHT     LAG
3 Alphaproteobacteria     CDS  CDS  2902   +   0.607843    RIGHT    LEAD
4 Alphaproteobacteria     CDS  CDS  3693   +   0.617647    RIGHT    LEAD
5 Alphaproteobacteria     CDS  CDS  4227   +   0.699208    RIGHT    LEAD
Most of these columns are factors.
Now, I have a function that I would like to employ on this data
frame. Right now I cannot get it to work, and that seems to be due to
the columns in the data frame being factors. I tested it with a data
frame created from vectors, and it worked fine.
The function:
percentdistance <- function(origin, terminus, length, begin, replicor){
  print(c(origin, terminus, length, begin, repl))
  d = 0
  if (terminus>origin) {
    if(replicor=="LEFT") {
      d = -((origin-begin)%%length)
    }
    else {
      d = (begin-origin)
    }
  }
  else {
    if (replicor=="LEFT") {
      d=(origin-begin)
    }
    else{
      d = -((begin-origin)%%length)
    }
  }
  d/length*2
}
The error I get:
> percentdistance(gctable$X60_origin, gctable$X60_terminus, gctable$length,
+ gctable$begin, gctable$replicor)
[1] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
[19] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
[37] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
[55] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
[73] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
[91] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
[109] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
[127] 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87 87
...
[99919] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[99937] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[99955] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[99973] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[99991] 2 2 2 2 2 2 2 2 2
[ reached getOption("max.print") -- omitted 8526091 entries ]]
Error in if (terminus > origin) { : missing value where TRUE/FALSE needed
In addition: Warning messages:
1: > not meaningful for factors in: Ops.factor(terminus, origin)
2: the condition has length > 1 and only the first element will be used in:
if (terminus > origin) {
This worked nicely when the inputs were columns from a data frame created
from vectors.
I have also tried the different apply-functions, although I am
uncertain of which one would be appropriate here.
...
Karin
--
Karin Lagesen, PhD student
[EMAIL PROTECTED]
http://folk.uio.no/karinlag
Hi Karin!
A couple of things:
First, the first warning message tells you that:
1: > not meaningful for factors in: Ops.factor(terminus, origin)
That is, terminus and origin are factors, and unordered factors cannot be
compared with >. You have to convert them to numeric variables first (see
the R FAQ for how to do this).
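As a rough illustration of the conversion (a sketch on made-up values, not the poster's real data; the names origin_f and origin_n are hypothetical): calling as.numeric() directly on a factor returns its internal level codes, which is probably why the printed output in the question shows values such as 87 and 2 rather than genome coordinates. Going through the labels recovers the original numbers.

```r
# Toy factor standing in for a column such as gctable$X60_origin
# (assumption: the real column holds numbers stored as factor labels).
origin_f <- factor(c("179", "773000", "179"))

# as.numeric() on a factor gives the level codes, not the values.
as.numeric(origin_f)                          # codes: 1 2 1

# Convert via the labels to get the actual numbers back.
origin_n <- as.numeric(as.character(origin_f))
origin_n                                      # values: 179 773000 179

# Equivalent form that converts each distinct level only once.
origin_n2 <- as.numeric(levels(origin_f))[origin_f]
```

Applied to the data frame in the question, this would be done for each numeric column (for example X60_origin, X60_terminus, length and begin) before calling the function.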
The second warning message tells you that:
2: the condition has length > 1 and only the first element will be
used in: if (terminus > origin)
You are comparing two vectors, which generates a vector of TRUE/FALSE values.
The if statement needs a single TRUE/FALSE value.
Either use a for loop:
for (i in seq_along(terminus)) { if (terminus[i] > origin[i]) ... }
or a nested ifelse statement (which is preferable on such a big data set).
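The nested ifelse approach might look roughly like the sketch below. It assumes that the comparison in the original function is terminus > origin and that replicor is compared against the string "LEFT" (both were mangled in the archived message), and that the numeric columns have already been converted from factors. The name percentdistance_vec and the toy data frame are hypothetical, made up for illustration.

```r
# Vectorised sketch of percentdistance(): ifelse() evaluates the condition
# element by element, so whole columns can be passed in at once.
percentdistance_vec <- function(origin, terminus, length, begin, replicor) {
  left <- replicor == "LEFT"   # works for character vectors and factors
  d <- ifelse(terminus > origin,
              ifelse(left, -((origin - begin) %% length), begin - origin),
              ifelse(left, origin - begin, -((begin - origin) %% length)))
  d / length * 2
}

# Toy rows modelled on the values shown in the question, plus one
# invented LEFT row to exercise the other branch.
toy <- data.frame(
  origin   = c(179, 179, 179),
  terminus = c(773000, 773000, 773000),
  length   = c(3389227, 3389227, 3389227),
  begin    = c(261, 1737, 261),
  replicor = c("RIGHT", "RIGHT", "LEFT")
)

res <- with(toy, percentdistance_vec(origin, terminus, length, begin, replicor))
res
```

Because ifelse() works on the whole vectors, no explicit loop is needed, which matters when the data frame has over a million rows.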
best,
Gustaf
--
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help