Andrew,

That did the trick.

Thank you.
Dan

From: Andrew Robinson [mailto:mensuration...@gmail.com]
Sent: Monday, January 14, 2013 6:06 PM
To: Lopez, Dan
Cc: R help (r-help@r-project.org)
Subject: Re: [R] Random Forest Error for Factor to Character column

After you subset the data, did you redeclare the factor? If not, R still
thinks the column can take all of the original levels.

TRAINSET$JOBTITLE <- factor(TRAINSET$JOBTITLE)
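
A minimal sketch of why this works, using made-up data (the data frame `d`, its job titles and the `TARGET` values are hypothetical, not taken from the original dataset). Subsetting a factor keeps every original level, and calling `factor()` again rebuilds the levels from the values that remain:

```r
# Hypothetical data: a factor with five levels, standing in for JOBTITLE.
d <- data.frame(
  JOBTITLE = factor(c("Analyst", "Clerk", "Engineer",
                      "Manager", "Analyst", "Technician")),
  TARGET   = c(1, 0, 1, 0, 1, 0)
)
nlevels(d$JOBTITLE)        # number of levels before subsetting

# Subsetting keeps the full level set, even for levels with no rows left.
sub <- d[d$JOBTITLE %in% c("Analyst", "Clerk"), ]
levels_after_subset <- nlevels(sub$JOBTITLE)
table(sub$JOBTITLE)        # unused levels appear with a count of 0

# Re-declaring the factor rebuilds the levels from the values present.
sub$JOBTITLE <- factor(sub$JOBTITLE)
levels_after_redeclare <- nlevels(sub$JOBTITLE)

# droplevels() does the same for every factor column of a data frame.
sub2 <- droplevels(d[d$JOBTITLE %in% c("Analyst", "Clerk"), ])
levels(sub2$JOBTITLE)
```

Either `factor()` on the single column or `droplevels()` on the whole data frame should leave only the levels that actually occur, which is what matters for the 32-category limit.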

I hope this helps

Andrew

On Tuesday, January 15, 2013, Lopez, Dan wrote:
Hi,

Can someone please offer me some guidance?

I imported some data. One of the columns, "JOBTITLE", was imported as a
factor column with 416 levels.

I subset the data in such a way that only 4 levels have data in "JOBTITLE" and 
tried running randomForest but it complained about "JOBTITLE" having more than 
32 categories. I know that is the limit in randomForest but I guess I don't 
understand enough about factors because I thought by subsetting the data this 
no longer would be an issue. BTW I can run randomForest on this dataset if I 
exclude "JOBTITLE".

So I then converted that column to a character vector:
> TRAINSET$JOBTITLE<-as.character(TRAINSET$JOBTITLE)

I ran Random Forest and got the below error. Why isn't this working? What do I 
need to do to get this working?

> library(randomForest)
> FOREST_model <- randomForest(as.factor(TARGET)~., data=trainset, mtry=4,
+                            ntree=1000, importance=TRUE, do.trace=100)

Error in randomForest.default(m, y, ...) :
  NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message:
In data.matrix(x) : NAs introduced by coercion
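
The warning points at the character column. What follows is a rough sketch of that kind of coercion, assuming (from the warning text only) that the predictors were converted to a numeric matrix. The vector `titles` is made up, and `as.numeric` is used as a stand-in because the behaviour of `data.matrix` on character columns may differ between R versions:

```r
# Hypothetical job titles stored as character strings, not as a factor.
titles <- c("Analyst", "Clerk", "Analyst")

# Non-numeric strings cannot be turned into numbers, so each becomes NA.
# R normally warns "NAs introduced by coercion"; suppressed here for clarity.
as_num <- suppressWarnings(as.numeric(titles))
as_num
any(is.na(as_num))

# A factor, by contrast, carries integer codes behind its labels.
codes <- as.integer(factor(titles))
codes
```

This is why a factor column (with few enough levels) is usable where a plain character column is not.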

Your help will be greatly appreciated.

Dan

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Andrew Robinson
Director (A/g), ACERA
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia               (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr              Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/
