Hi,

On Fri, Dec 25, 2009 at 12:49 AM, Vishal Thapar <vishaltha...@gmail.com> wrote:
> Hi Steve,
>
> Thank you so much for the reply. The response to your queries are:
> What do these commands return over your data?
>
> 1. is(train500)
> --> "data.frame" "list" "oldClass" "mpinput" "vector"
> 2. is(train500$class)
> --> "NULL" "OptionalFunction" "output"
> 3. is(train500[1,5])
> --> "factor" "integer" "oldClass" "output" "numeric" "vector"
> 4. is(testSeq)
> --> "data.frame" "list" "oldClass" "mpinput" "vector"
> 5. is(testSeq[1,5])
> --> "factor" "integer" "oldClass" "output" "numeric" "vector"
> 6. is(testSeq$class)
> --> "NULL" "OptionalFunction" "output"
>
>
>
>> How similar are we talking -- something is (obviously) off because
>> using the promotergene dataset is quite straightforward:
>>
>> library(kernlab)
>> data(promotergene)
>> tr <- promotergene[1:90,]
>> ts <- promotergene[91:106,]
>> m <- ksvm(Class~., data=promotergene, kernel="rbfdot", kpar =
>> "automatic", C = 60, cross = 3, prob.model = TRUE)
>> p <- predict(m, ts)
>>
> Right. here is the first line from my training set:
> Class V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19
> V20 V21 V22 V23 V24 V25 V26 V27 V28
> 1 + T A A A C T T A T A A A T A T A A A A
> C T T T T T A A T
> V487 V488 V489 V490 V491 V492 V493 V494 V495 V496 V497 V498 V499 V500
> 1 G A T T T C A T T T T G T T
>
> Here is the first record for the promoter gene set:
>
> Class V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20
> V21 V22 V23 V24 V25 V26 V27 V28 V29 V30 V31 V32 V33 V34 V35 V36 V37 V38
> 1 + g c c t t c t c c a a a a c g t g t
> t t t t t g t t g t t a a t t c g g t
> V39 V40 V41 V42 V43 V44 V45 V46 V47 V48 V49 V50 V51 V52 V53 V54 V55 V56
> V57 V58
> 1 g t a g a c t t g t a a a c c t a a
> a t

I'm guessing the factors aren't comparable? See the lower vs. uppercase?

Can you try to uppercase your data as you read it in? Eg, you're doing this:

    chr4Seq = scan(my.file,list("",""),nlines=2)
    while(length(chr4Seq[[1]])>0) {
      seqId = chr4Seq[[1]];
      testSeq = as.data.frame(t(s2c(chr4Seq[[2]])));
      testSeq=cbind(Class="-",testSeq); # this is optional, I added this later to see if having the Class in the record removes the error.
      predictSvm1 <- (predict(modelforSVM, testSeq));
      print(predictSvm1);
      chr4Seq = scan(my.file,list("",""),nlines=2);
    }

Call toupper() on the second line of the while loop:

    testSeq = as.data.frame(t(s2c(toupper(chr4Seq[[2]]))))

And 2 questions:

1. What is your "s2c" function doing?
2. Why are you ending your lines with semi-colons?

Hope that helps,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
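
P.S. To illustrate the case mismatch suspected above, here is a minimal sketch in base R. It is not the original code: the toy data frame `train`, the string `new_seq`, and the use of strsplit() as a stand-in for s2c() are all assumptions made for illustration. It only shows that lowercase letters match none of the uppercase factor levels, and that toupper() brings them back in line.

```r
# Hypothetical toy data: the training column holds uppercase bases,
# while the new sequence arrives in lowercase.
train   <- data.frame(V1 = factor(c("A", "C", "G", "T")))
new_seq <- "acgt"

# strsplit() is an assumed stand-in for s2c(): one character per element.
chars_raw <- strsplit(new_seq, "")[[1]]
chars_up  <- strsplit(toupper(new_seq), "")[[1]]

# Letters that are not among the training levels.
mismatch <- setdiff(chars_raw, levels(train$V1))
print(mismatch)

# Forcing the lowercase letters onto the training levels yields only NA.
bad <- factor(chars_raw, levels = levels(train$V1))
print(all(is.na(bad)))

# After toupper(), every letter is a known level and nothing is lost.
test_v1 <- factor(chars_up, levels = levels(train$V1))
print(anyNA(test_v1))
```

Comparing levels() of each training column against the matching test column in this way is a quick check worth running before calling predict().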