[R] problem with predict(mboost,...)

2010-10-20 Thread LWF

Hi,

I use an mboost model to predict my dependent variable on new data. I get the
following warning message:
In bs(mf[[i]], knots = args$knots[[i]]$knots, degree = args$degree,  :
   some 'x' values beyond boundary knots may cause ill-conditioned bases

Some of the predicted values are negative, although the variable in the
training data ranges from 3 to 8 on a numeric scale. To restrict the
predicted values to the range from 3 to 8, I limit the feature space of the
prediction data to the minima and maxima of the training data for every
predictor variable before applying the model to the new data.
As base-learners in mboost I use splines ("bbs"):

mod <- mboost(MF ~ bbs(predictor1) + bbs(predictor2) + bbs(...), data = train)

I wonder why there are negative values when applying the model to new data,
because both the training and the prediction data have the same value ranges
in the predictor variables.

Has anybody seen the same warning message? Can someone help me, please?
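
The clamping step described above could be sketched as follows. This is a
minimal base-R sketch, not the code used in the post: the data frames, the
helper name clamp_to_training_range and all values are made up for
illustration. It also counts how many new values fall outside the training
range, which is the situation the bs() warning refers to. Note that clamping
the predictors does not by itself bound the predictions to the range of the
response.

```r
# Hypothetical training and prediction data (not from the original post).
train   <- data.frame(predictor1 = c(1.0, 2.5, 4.0), predictor2 = c(10, 20, 30))
newdata <- data.frame(predictor1 = c(0.5, 3.0, 4.7), predictor2 = c(12, 35, 8))

# Clamp every shared numeric predictor in `newdata` to the training range.
# The function name is an assumption made for this sketch.
clamp_to_training_range <- function(newdata, train) {
  shared <- intersect(names(newdata), names(train))
  for (v in shared) {
    if (is.numeric(train[[v]]) && is.numeric(newdata[[v]])) {
      rng <- range(train[[v]], na.rm = TRUE)
      newdata[[v]] <- pmin(pmax(newdata[[v]], rng[1]), rng[2])
    }
  }
  newdata
}

clamped <- clamp_to_training_range(newdata, train)

# Count the values that were outside the training range before clamping.
n_outside <- sapply(names(train), function(v) {
  rng <- range(train[[v]], na.rm = TRUE)
  sum(newdata[[v]] < rng[1] | newdata[[v]] > rng[2])
})
print(n_outside)
print(clamped)
```

If n_outside is zero for every predictor and the warning still appears, the
prediction data passed to predict() is probably not the clamped copy.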

TIM

-- 
Tim Häring
Bavarian State Institute of Forestry
Department of Forest Ecology
Hans-Carl-von-Carlowitz-Platz 1
D-85354 Freising

E-Mail: tim.haer...@lwf.bayern.de
http://www.lwf.bayern.de

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] nu-SVM crashes in e1071

2010-03-03 Thread LWF

(...)
 
> While you're sending your bug report to David, perhaps you can try the
> SVM from kernlab.
> 
> It relies on code from libsvm, too, but ... you never know. It can't
> hurt to try.

Hi Steve,

thanks for that hint.
I tried the ksvm() function but got an error message:

model <- ksvm(soil_unit~., train, type="nu-svc")
Using automatic sigma estimation (sigest) for RBF or laplace kernel 
Error in votematrix[i, ret < 0] <- votematrix[i, ret < 0] + 1 : 
  NAs are not allowed in subscripted assignments

But there are no NAs in my dataset. I checked it with 
summary(is.na(train))
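
A slightly more targeted check than summary(is.na(train)) could look like the
sketch below. The data frame is hypothetical, and whether non-finite values or
empty factor levels have anything to do with the ksvm() error is only an
assumption; the sketch just shows how to rule them out.

```r
# Hypothetical data frame standing in for `train` (values are made up).
train <- data.frame(
  soil_unit = factor(c("a", "b", "a", "c"), levels = c("a", "b", "c", "d")),
  slope     = c(1.2, NA, 3.4, Inf),
  aspect    = c(10, 20, 30, 40)
)

# Missing values per column.
na_per_col <- colSums(is.na(train))

# Non-finite values (NA, NaN, Inf, -Inf) in numeric columns.
nonfinite_per_col <- sapply(train, function(col) {
  if (is.numeric(col)) sum(!is.finite(col)) else 0L
})

# Factor levels that never occur in the data.
empty_levels <- lapply(Filter(is.factor, train), function(f) {
  names(which(table(f) == 0))
})

print(na_per_col)
print(nonfinite_per_col)
print(empty_levels)
```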

I sent an Email to David...

TIM

Re: [R] nu-SVM crashes in e1071

2010-03-02 Thread LWF


> > I'm using SVMs for multi-class classification problems. Therefore I'm
> > using the svm() function in the package "e1071".
> > If I use svm(..., type="C-classification") everything works fine. But
> > if I want to use nu-SVM with svm(..., type="nu-classification", nu=0.5)
> > R crashes immediately. No error message - just a crash.
> >
> > Did anybody have the same problem and maybe a solution?
> > I'm using R 2.10.0 and the latest version of e1071
> 
> 
> Maybe for your unstated OS with unstated version of e1071 on an outdated
> version of R without a reproducible example given.
> 
> For my WinXP, R-2.10.1, e1071 1.5-22 I get:
> 
> library(e1071)
> data(iris)
> model <- svm(Species ~ ., data = iris, type="nu-classification")
> model
> 

OK, sorry for the sparse information.
I just updated to R-2.10.1 and e1071 version 1.5-22 on WinXP.
I can reproduce the example with the iris dataset. However, R crashes when I
call svm() with my dataset:

model <- svm(soil_unit ~ ., data = traindat, type="nu-classification")

My dataset consists of 9259 obs. of 14 variables. My target variable is a
factor variable with 22 levels (multi-class classification). The predictors
are 12 numeric variables and 1 factor variable.
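
One thing that may be worth checking with 22 classes is whether nu = 0.5 is
feasible at all. As far as I know, LIBSVM rejects a nu-SVC when, for any pair
of classes with n_i and n_j observations, nu exceeds
2 * min(n_i, n_j) / (n_i + n_j). Whether this is related to the crash is an
assumption, not something confirmed in this thread. The helper name
max_feasible_nu and the example labels below are made up.

```r
# Largest feasible nu over all pairs of classes:
# 2 * min(n_i, n_j) / (n_i + n_j), minimised over the pairs.
max_feasible_nu <- function(y) {
  counts <- table(droplevels(factor(y)))
  pairs  <- combn(length(counts), 2)
  limits <- apply(pairs, 2, function(p) {
    n <- as.numeric(counts[p])
    2 * min(n) / sum(n)
  })
  min(limits)
}

# Hypothetical class labels (not the poster's data); one class is rare.
y_balanced   <- factor(rep(c("a", "b", "c"), times = c(50, 50, 50)))
y_imbalanced <- factor(rep(c("a", "b", "c"), times = c(100, 40, 10)))

print(max_feasible_nu(y_balanced))    # 1
print(max_feasible_nu(y_imbalanced))  # 2 * 10 / 110, about 0.18
```

With the real data this would be max_feasible_nu(traindat$soil_unit); a value
below 0.5 would suggest trying a smaller nu.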

Hoping this information is enough.

TIM

[R] nu-SVM crashes in e1071

2010-03-02 Thread LWF

Hello !

I'm using SVMs for multi-class classification problems. Therefore I'm using
the svm() function in the package "e1071".
If I use svm(..., type="C-classification") everything works fine. But if I
want to use nu-SVM with svm(..., type="nu-classification", nu=0.5), R crashes
immediately. No error message - just a crash.

Did anybody have the same problem and maybe a solution?
I'm using R 2.10.0 and the latest version of e1071.

Thanks
TIM 

BTW: Using the LibSVM wrapper in Weka the same happens. Maybe there is a 
problem in the LibSVM code...

---
 
Tim Häring
Bavarian State Institute of Forest Research 
Department of Forest Ecology
Hans-Carl-von-Carlowitz-Platz 1
D-85354 Freising

E-Mail: tim.haer...@lwf.bayern.de
http://www.lwf.bayern.de

[R] different randomForest performance for same data

2009-12-10 Thread LWF

Hello,

I came across a problem when building a randomForest model. Maybe someone can
help me.
I have a training and a test dataset with a discrete response and ten
predictors (numeric and factor variables). The two datasets agree in the
number of predictors, the variable names and the variable types (factor,
numeric), except that one predictor has 20 levels in the training dataset and
only 19 levels in the test dataset.
I found that the model performance differs between training and testing a
model on the unchanged datasets and doing the same after assigning the levels
of the training dataset to the test dataset. I only assign the levels and do
not change the data itself, yet the models perform differently.
Why???

Here is my code:
> library(randomForest)
> load("datasets.RData")  # import traindat and testdat
> nlevels(traindat$predictor1)
[1] 20
> nlevels(testdat$predictor1)
[1] 19
> nrow(traindat)
[1] 9838
> nrow(testdat)
[1] 3841
> set.seed(10)
> rf_orig <- randomForest(x=traindat[,-1], y=traindat[,1], xtest=testdat[,-1],
+   ytest=testdat[,1], ntree=100)
> data.frame(rf_orig$test$err.rate)[100,1]  # Error on test-dataset
[1] 0.3082531

# assign the levels of the training dataset to the test dataset for predictor 1
> levels(testdat$predictor1) <- levels(traindat$predictor1)  
> nlevels(traindat$predictor1)
[1] 20
> nlevels(testdat$predictor1)
[1] 20
> nrow(traindat)
[1] 9838
> nrow(testdat)
[1] 3841
> set.seed(10)
> rf_mod <- randomForest(x=traindat[,-1], y=traindat[,1], xtest=testdat[,-1],
+   ytest=testdat[,1], ntree=100)
> data.frame(rf_mod$test$err.rate)[100,1]   # Error on test-dataset
[1] 0.4808644  # is different
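
A likely explanation, shown here on made-up data: levels(x) <- new_levels
renames the existing levels by position, so unless the missing level happens
to be the last one, some test values silently change their meaning. Re-encoding
with factor(x, levels = ...) adds the missing level without touching the
values. The vectors below are hypothetical and much smaller than the real
predictor.

```r
train_x <- factor(c("a", "b", "c", "a"))   # levels: a, b, c
test_x  <- factor(c("a", "c", "c"))        # levels: a, c ("b" is absent)

# Positional relabelling: the second test level "c" is renamed to "b".
relabelled <- test_x
levels(relabelled) <- levels(train_x)
print(as.character(relabelled))   # "a" "b" "b"

# Re-encoding with the training levels keeps every value as it was.
aligned <- factor(test_x, levels = levels(train_x))
print(as.character(aligned))      # "a" "c" "c"
print(nlevels(aligned))           # 3
```

Applied to the data in the post, this would read
testdat$predictor1 <- factor(testdat$predictor1, levels = levels(traindat$predictor1)).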

Cheers,
TIM

[R] different model performance because of nlevels()???

2009-12-07 Thread LWF

Hello everybody,

I came across a problem when building a randomForest model. Maybe someone can
help me.
I have a training and a test dataset with a discrete response and ten
predictors (numeric and factor variables). The two datasets agree in the
number of predictors, the variable names and the variable types (factor,
numeric), except that one predictor has 20 levels in the training dataset and
only 19 levels in the test dataset.
I found that the model performance differs between training and testing a
model on the unchanged datasets and doing the same after assigning the levels
of the training dataset to the test dataset. I only assign the levels and do
not change the data itself, yet the models perform differently.
Why???

Here is my code:
> library(randomForest)
> load("datasets.RData")  # import traindat and testdat
> nlevels(traindat$predictor1)
[1] 20
> nlevels(testdat$predictor1)
[1] 19
> nrow(traindat)
[1] 9838
> nrow(testdat)
[1] 3841
> set.seed(10)
> rf_orig <- randomForest(x=traindat[,-1], y=traindat[,1], xtest=testdat[,-1],
+   ytest=testdat[,1], ntree=100)
> data.frame(rf_orig$test$err.rate)[100,1]  # Error on test-dataset
[1] 0.3082531

# assign the levels of the training dataset to the test dataset for predictor 1
> levels(testdat$predictor1) <- levels(traindat$predictor1)  
> nlevels(traindat$predictor1)
[1] 20
> nlevels(testdat$predictor1)
[1] 20
> nrow(traindat)
[1] 9838
> nrow(testdat)
[1] 3841
> set.seed(10)
> rf_mod <- randomForest(x=traindat[,-1], y=traindat[,1], xtest=testdat[,-1],
+   ytest=testdat[,1], ntree=100)
> data.frame(rf_mod$test$err.rate)[100,1]   # Error on test-dataset
[1] 0.4808644  # is different

Cheers,
TIM

[R] creating lists in a list with loop

2009-09-04 Thread Häring, Tim (LWF)

Hello !

I want to create a spatially stratified sampling scheme with the package
spsurvey. To do this with the function "grts" in spsurvey, I need to create a
list containing the specifications for each stratum. These specifications are
stored in a named list, where the name of each inner list is the name of the
stratum. This means I need to create an "outer" list containing several
"inner" lists.

> Stratdsgn <- list("Stratum1"=list(panel=c(Panel=6), seltype="Equal"),
+   "Stratum2"=list(panel=c(Panel=2), seltype="Equal"),
+   "Stratum3"=list(panel=c(Panel=4), seltype="Equal"))
> str(Stratdsgn)
List of 3
 $ Stratum1:List of 2
  ..$ panel  : Named num 6
  .. ..- attr(*, "names")= chr "Panel"
  ..$ seltype: chr "Equal"
 $ Stratum2:List of 2
  ..$ panel  : Named num 2
  .. ..- attr(*, "names")= chr "Panel"
  ..$ seltype: chr "Equal"
 $ Stratum3:List of 2
  ..$ panel  : Named num 4
  .. ..- attr(*, "names")= chr "Panel"
  ..$ seltype: chr "Equal"

Because I have not just 3 strata but 50 or more, I would like to create this
list with a for loop.
Could somebody help me do this? I didn't manage to create a list within a
list in a loop.
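
One way to build such a nested list is sketched below, assuming the stratum
names and panel sizes are available as two vectors of equal length. The vector
names strata_names and panel_sizes are made up for this sketch; with 50 strata
they would simply be longer.

```r
# Hypothetical stratum names and panel sizes.
strata_names <- c("Stratum1", "Stratum2", "Stratum3")
panel_sizes  <- c(6, 2, 4)

# Variant 1: a for loop filling a pre-allocated named list.
Stratdsgn <- vector("list", length(strata_names))
names(Stratdsgn) <- strata_names
for (i in seq_along(strata_names)) {
  Stratdsgn[[i]] <- list(panel = c(Panel = panel_sizes[i]), seltype = "Equal")
}

# Variant 2: the same result without an explicit loop.
Stratdsgn2 <- lapply(panel_sizes, function(n) {
  list(panel = c(Panel = n), seltype = "Equal")
})
names(Stratdsgn2) <- strata_names

str(Stratdsgn)
```

Assigning with [[i]] is the important detail: Stratdsgn[i] <- list(...) would
splice the inner list's elements into the outer list instead of nesting it.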

Thanks for every hint.

TIM

[R] feature weighting in randomForest

2009-08-05 Thread Häring, Tim (LWF)

Hello !

I'm using randomForest for classification problems. My dataset has 21,000
observations and 96 predictors. I know that some predictors in my dataset have
more influence on the classification than others.
Therefore I would like to know if there is a way to weight my predictors. I
know that for constructing each tree in a forest the most influential predictor
is used for partitioning the data. But maybe it would have an effect if I
weighted my predictors.

Thanks in advance

TIM

---
 
Tim Haering
Bavarian State Institute of Forest Research 
Department of Forest Ecology
Am Hochanger 11
D-85354 Freising

http://www.lwf.bayern.de

[R] "prob" in predict(randomForest)

2009-05-05 Thread LWF

Hi all,

Maybe this question is quite simple for a statistician, but for me it is not.
After reading a lot of mails in the R-help archive I'm still not quite sure I
get it.
When applying a randomForest to a new dataset with predict(randomForest), I
have the option to get the output as probabilities (classification problem):
predict(myrf,...,type="prob")
I would like to know how to interpret this output. Are these values the
probability of an observation belonging to a predicted class? Say I have a
data point as newdata, my rf model predicts class A and the probability is
0.12301. Does this mean that this data point belongs to class A only with a
probability of 12%?
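
For orientation, the values returned with type="prob" are, to my
understanding, the shares of trees in the forest that vote for each class, one
row per observation and one column per class. The sketch below imitates this
for a single observation in base R; the votes are made up and no randomForest
model is fitted.

```r
# Hypothetical votes of 10 trees for one observation (not real model output).
tree_votes <- factor(c("A", "A", "A", "B", "B", "C", "C", "D", "E", "F"),
                     levels = c("A", "B", "C", "D", "E", "F"))

# Share of trees voting for each class; under this reading, this is what
# one row of the "prob" matrix looks like.
tab <- table(tree_votes)
vote_share <- setNames(as.vector(tab), names(tab)) / length(tree_votes)
print(vote_share)

# The predicted class is the one with the largest share.
predicted <- names(vote_share)[which.max(vote_share)]
print(predicted)                 # "A"
print(vote_share[[predicted]])   # 0.3
```

Under this reading, a value of 0.12 for the predicted class means that about
12% of the trees voted for it and that no other class received more votes,
which can happen when there are many classes.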

Thanks for every hint.

TIM

For the record: I'm using R version 2.8.1, randomForest package 4.5-28,
OS: WinXP

---
 
Tim Häring
Bavarian State Institute of Forest Research 
Department of Forest Ecology
Am Hochanger 11
D-85354 Freising

E-Mail: tim.haer...@lwf.bayern.de
http://www.lwf.bayern.de

[R] gbm for multi-class problems

2009-04-07 Thread LWF

Dear List,

I'm working on a classification problem. My response has 60 levels.

I'm very interested in boosted trees like AdaBoost or gradient boosting
machines as implemented in the package "gbm". Unfortunately gbm is only
applicable to 2-class problems.

Is there anybody out there who can help me? Is there a way to use gbm() for
multi-class problems? Maybe there is a way to transform my dataset into 60
datasets with a binary response for a one-against-all classification, but I
didn't find anything on the R-project homepage or on Google.
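
The one-against-all transformation mentioned above could be sketched as
follows in base R. The response y and the score matrix are made up; in
practice each column of scores would come from one binary model (for example
one gbm fit per class), which is not shown here.

```r
# Hypothetical multi-class response (3 classes instead of 60).
y <- factor(c("a", "b", "c", "a", "c", "b"))

# One 0/1 response per class: 1 if the observation belongs to that class.
binary_responses <- lapply(levels(y), function(k) as.integer(y == k))
names(binary_responses) <- levels(y)
str(binary_responses)

# Hypothetical per-class scores for three new observations, e.g. predicted
# probabilities from the binary models (one column per class).
scores <- cbind(a = c(0.8, 0.1, 0.2),
                b = c(0.1, 0.7, 0.3),
                c = c(0.3, 0.4, 0.6))

# One-against-all decision: pick the class with the highest score.
predicted <- colnames(scores)[max.col(scores, ties.method = "first")]
print(predicted)   # "a" "b" "c"
```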

 

Thanks for any help.

TIM

---
Tim Häring
Bavarian State Institute of Forest Research
Department of Forest Ecology
Am Hochanger 11
D-85354 Freising

E-Mail: tim.haer...@lwf.bayern.de
http://www.lwf.bayern.de

Re: [R] dimnames in pkg "ipred"

2009-01-23 Thread Häring, Tim (LWF)

I think I solved the problem =)
My dataset is an .arff file, so I read my data into R via read.arff.
I tried the following:
export the data frame to a txt file and import it into R again via
read.table.
With the new dataset it works fine. Maybe the error comes from the
variable names. I attached a txt file containing the str(traindat.bin) output
of the data.frame that I imported via read.arff.
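
A possible reason why the round trip helps: read.table() by default
(check.names = TRUE) passes column names through make.names(), which replaces
characters such as "=" and "-" that are not allowed in syntactic R names. That
this is what fixes the bagging() call is an assumption. The sketch below uses
three of the names from the attached str() output.

```r
# Column names as they appear in the str() output of the ARFF import.
arff_names <- c("SOIL_UNIT",
                "GEOL_UNIT=Allgaeuschichten",
                "GEOL_UNIT=Baustein-Schichten")

# Which names are not syntactically valid R names?
invalid <- arff_names[arff_names != make.names(arff_names)]
print(invalid)

# Sanitised names, as read.table() would produce with check.names = TRUE.
clean_names <- make.names(arff_names, unique = TRUE)
print(clean_names)
# "SOIL_UNIT" "GEOL_UNIT.Allgaeuschichten" "GEOL_UNIT.Baustein.Schichten"

# Applied directly to the imported data frame, without the txt round trip:
# names(traindat.bin) <- make.names(names(traindat.bin), unique = TRUE)
```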

Cheers,
TIM


-----Original Message-----
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Friday, January 23, 2009 11:03 AM
To: Häring, Tim (LWF)
Subject: Re: AW: [R] dimnames in pkg "ipred"



Häring, Tim (LWF) wrote:
> OK, the information I sent to the list was rather sparse. Sorry for that!
> I just tried the command with the recent versions of R and ipred. The error
> message is the same.
> I want to create a classification model. My data consist of 5414 observations
> and 98 variables, of which 33 are numeric; the remainder are binary nominal
> (factor) variables. My output SOIL_UNIT is a factor variable with 82 levels.
>
> I hope this is enough information to understand the problem.

What does str(traindat.bin) tell you? Is it a data.frame?
Can you reduce the data.frame in a way (less variables and observations) 
so that you can send the rest by e-mail and we can see the error?

Uwe Ligges


> Cheers,
> TIM
> 
> 
> 
> -----Original Message-----
> From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
> Sent: Thursday, January 22, 2009 6:49 PM
> To: Häring, Tim (LWF)
> Cc: r-help@r-project.org
> Subject: Re: [R] dimnames in pkg "ipred"
> 
> 
> 
> Häring, Tim (LWF) wrote:
>> Hello List,
>>
>> I'm trying to make predictions using a bagged tree with the package ipred. I
>> tried to follow the manual but I'm getting an error message. Browsing
>> through the list archive, I didn't find any hint either.
>>
>> Maybe someone can help me?
>>
>> selbag <- bagging(SOIL_UNIT ~., data=traindat.bin, coob=TRUE)
>>
>> Error in dimnames(X) <- list(dn[[1L]], unlist(collabs, use.names = FALSE)) :
>>   length of 'dimnames' [2] not equal to array extent
>>
>> I'm using R 2.7.2 on Win XP and the latest version of ipred.
> 
> 
> Please do read the posting guide.
> 
> - We do not have "traindat.bin", hence cannot reproduce your problem
> - Does it happen with recent versions of R and ipred?
> 
> Best,
> Uwe Ligges
> 
> 
> 
> 
>>  
>>
>> Thanks a lot.
>>
>> TIM
>>
>> ---
>> Dipl.-Geogr. Tim Häring
>> Sachgebiet Standort und Bodenschutz (SG 2.1)
>> Bayerische Landesanstalt für Wald und Forstwirtschaft
>> Am Hochanger 11
>> D-85354 Freising
>>
>> Tel.: +49-(0)8161/71-4769
>> E-Mail: tim.haer...@lwf.bayern.de
>> http://www.lwf.bayern.de
> str(traindat.bin)
'data.frame':   5414 obs. of  98 variables:
 $ SOIL_UNIT                             : Factor w/ 82 levels "17b","19a","19b",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ GEOL_UNIT=Allgaeuschichten            : Factor w/ 2 levels "f","t": 1 1 1 1 1 1 1 1 1 1 ...
 $ GEOL_UNIT=Anmooriger_Boden            : Factor w/ 2 levels "f","t": 1 1 1 1 1 1 1 1 1 1 ...
 $ GEOL_UNIT=Aptychenschichten           : Factor w/ 2 levels "f","t": 1 1 1 1 1 1 1 1 1 1 ...
 $ GEOL_UNIT=Bachschuttkegel             : Factor w/ 2 levels "f","t": 1 1 1 1 1 1 1 1 1 1 ...
 $ GEOL_UNIT=Baustein-Schichten          : Factor w/ 2 levels "f","t": 1 1 1 1 1 1 1 1 1 1 ...
 $ GEOL_UNIT=Baustein-Schichten_Nagelfluh: Factor w/ 2 levels "f","t": 1 1 1 1 1 1 1 1 1 1 ...
 $ GEOL_UNIT=Bergschlipf                 : Factor w/ 2 levels "f","t": 1 1 1 1 1 1 1 1 1 1 ...
 $ GEOL_UNIT=Bergsturz

Re: [R] dimnames in pkg "ipred"

2009-01-22 Thread Häring, Tim (LWF)

OK, the information I sent to the list was rather sparse. Sorry for that!
I just tried the command with the recent versions of R and ipred. The error
message is the same.
I want to create a classification model. My data consist of 5414 observations
and 98 variables, of which 33 are numeric; the remainder are binary nominal
(factor) variables. My output SOIL_UNIT is a factor variable with 82 levels.

I hope this is enough information to understand the problem.

Cheers,
TIM

-----Original Message-----
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Thursday, January 22, 2009 6:49 PM
To: Häring, Tim (LWF)
Cc: r-help@r-project.org
Subject: Re: [R] dimnames in pkg "ipred"

Häring, Tim (LWF) wrote:
> Hello List,
>
> I'm trying to make predictions using a bagged tree with the package ipred. I
> tried to follow the manual but I'm getting an error message. Browsing
> through the list archive, I didn't find any hint either.
>
> Maybe someone can help me?
>
> selbag <- bagging(SOIL_UNIT ~., data=traindat.bin, coob=TRUE)
>
> Error in dimnames(X) <- list(dn[[1L]], unlist(collabs, use.names = FALSE)) :
>   length of 'dimnames' [2] not equal to array extent
>
> I'm using R 2.7.2 on Win XP and the latest version of ipred.

Please do read the posting guide.

- We do not have "traindat.bin", hence cannot reproduce your problem
- Does it happen with recent versions of R and ipred?

Best,
Uwe Ligges

>  
> 
> Thanks a lot.
>
> TIM
>
> ---
> Dipl.-Geogr. Tim Häring
> Sachgebiet Standort und Bodenschutz (SG 2.1)
> Bayerische Landesanstalt für Wald und Forstwirtschaft
> Am Hochanger 11
> D-85354 Freising
>
> Tel.: +49-(0)8161/71-4769
> E-Mail: tim.haer...@lwf.bayern.de
> http://www.lwf.bayern.de

[R] dimnames in pkg "ipred"

2009-01-22 Thread Häring, Tim (LWF)

Hello List,

I'm trying to make predictions using a bagged tree with the package ipred. I
tried to follow the manual but I'm getting an error message. Browsing
through the list archive, I didn't find any hint either.

Maybe someone can help me?

selbag <- bagging(SOIL_UNIT ~., data=traindat.bin, coob=TRUE)

Error in dimnames(X) <- list(dn[[1L]], unlist(collabs, use.names = FALSE)) :
  length of 'dimnames' [2] not equal to array extent

I'm using R 2.7.2 on Win XP and the latest version of ipred.

Thanks a lot.

TIM

---
Dipl.-Geogr. Tim Häring
Sachgebiet Standort und Bodenschutz (SG 2.1)
Bayerische Landesanstalt für Wald und Forstwirtschaft
Am Hochanger 11
D-85354 Freising

Tel.: +49-(0)8161/71-4769
E-Mail: tim.haer...@lwf.bayern.de
http://www.lwf.bayern.de

[R] problems with extractPrediction in package caret

2009-01-15 Thread Häring, Tim (LWF)

Hi list,

I'm working on a predictive modeling task using the caret package.
I found the best model parameters using the train() and trainControl()
commands. Now I want to evaluate my model and make predictions on a test
dataset. I tried to follow the instructions in the manual and the vignettes,
but unfortunately I'm getting an error message I can't figure out.
Here is my code:
rfControl <- trainControl(method = "oob", returnResamp = "all",
                          returnData = TRUE, verboseIter = TRUE)
rftrain <- train(x = train_x, y = trainclass, method = "rf",
                 tuneGrid = tuneGrid, tr.control = rfControl)

pred <- predict(rftrain)
pred  # this works fine
expred <- extractPrediction(rftrain)

Error in models[[1]]$trainingData : 
  $ operator is invalid for atomic vectors
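
The error message suggests that extractPrediction() indexes its first argument
as a list of models (models[[1]]), so passing a single train object may be the
cause. A hedged sketch on the iris data follows; it assumes a current caret
version, where the control argument of train() is named trControl and the
tuning grid column for "rf" is named mtry. It only runs if caret and
randomForest are installed, and the data and settings are not those of the
post.

```r
# Runs only if caret and randomForest are installed.
if (requireNamespace("caret", quietly = TRUE) &&
    requireNamespace("randomForest", quietly = TRUE)) {
  library(caret)
  set.seed(1)
  ctrl <- trainControl(method = "oob", returnData = TRUE)
  fit  <- train(x = iris[, 1:4], y = iris$Species, method = "rf",
                tuneGrid = data.frame(mtry = 2), trControl = ctrl,
                ntree = 50)

  # Wrap the single model in a list before passing it on.
  expred <- extractPrediction(list(rf = fit))
  print(head(expred))
}
```

For the objects in the post this would correspond to
extractPrediction(list(rftrain)).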

My predictors are 28 numeric attributes and one factor.
I'm working with the latest version of caret and R 2.7.2 on WinXP.

Any advice is very welcome.

Thanks.
TIM


--- 
Dipl.-Geogr. Tim Häring
Sachgebiet Standort und Bodenschutz (SG 2.1)
Bayerische Landesanstalt für Wald und Forstwirtschaft
Am Hochanger 11
D-85354 Freising

Tel.: +49-(0)8161/71-4769
E-Mail: tim.haer...@lwf.bayern.de
http://www.lwf.bayern.de




[R] Question about the RWEKA package

2009-01-07 Thread Häring, Tim (LWF)

Dear List,

I'm trying to use the functionality of WEKA in my modeling project in R
through the RWeka package.
In this context I have a rather specific question about the filters
implemented in WEKA.
I want to convert nominal attributes with k values into k binary attributes
through the NominalToBinary filter
("weka.filters.supervised.attribute.NominalToBinary"). But unfortunately I
can't apply the filter to my data.
Here is my code:

nombi <- make_Weka_filter("weka/filters/supervised/attribute/NominalToBinary")
x2bin <- nombi(data=dat, control =Weka_control(N=TRUE, A=TRUE))

I don't get an error message, but it still doesn't work. My nominal attribute
is of class "factor".
Maybe the problem has to do with the argument list.
Argument list:
  (formula, data, subset, na.action, control = NULL)
What is meant by the argument "formula"?
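
In R modelling interfaces a formula such as class ~ . names the response on
the left-hand side and the predictors on the right; for a supervised filter it
presumably tells the filter which column is the class attribute, though that
is an assumption here. Independently of RWeka, the same k-values-to-k-columns
conversion can be done in base R with model.matrix(). The sketch below is that
base-R alternative, not the NominalToBinary filter, and uses made-up data.

```r
# Hypothetical data with one nominal attribute of k = 3 values.
dat <- data.frame(geol  = factor(c("clay", "sand", "loam", "sand")),
                  slope = c(2.1, 5.4, 3.3, 7.8))

# "- 1" drops the intercept, so every level gets its own 0/1 column.
dummies <- model.matrix(~ geol - 1, data = dat)
print(colnames(dummies))   # "geolclay" "geolloam" "geolsand"

# Bind the indicator columns to the remaining predictors.
dat_bin <- cbind(dat["slope"], as.data.frame(dummies))
print(dat_bin)
```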

Any advice? I'd be glad for any hint!
I'm using R 2.7.2 and RWeka 0.3-14

TIM
--- 
Dipl.-Geogr. Tim Häring
Sachgebiet Standort und Bodenschutz (SG 2.1)
Bayerische Landesanstalt für Wald und Forstwirtschaft
Am Hochanger 11
D-85354 Freising

E-Mail: tim.haer...@lwf.bayern.de
http://www.lwf.bayern.de



