Re: [R] GLM: What is a good way for dealing with new factor levels in the test set?

Jim Lemon Wed, 29 Apr 2015 23:56:51 -0700

Hi thuksu,
Would defining the factor in your training set with all the levels
that occur in the test set solve the problem? That is, the training
set's factor would include each of those levels, even where the
training set has no observations at that level.
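
A minimal sketch of that idea, using made-up data: the data frames
`train` and `test` and the column `g` are hypothetical and not taken
from the original post. It only shows how to give both factors the same
level set. Whether glm() and predict() then treat the empty level the
way you want is not tested here.

```r
# Hypothetical toy data; names and values are for illustration only.
train <- data.frame(y = c(0, 1, 0, 1), g = c("a", "b", "a", "b"),
                    stringsAsFactors = FALSE)
test <- data.frame(g = c("a", "c"), stringsAsFactors = FALSE)

# Collect every level seen in either set, training levels first.
all_levels <- union(unique(train$g), unique(test$g))

# Define the factor in both sets with the same, complete level set.
train$g <- factor(train$g, levels = all_levels)
test$g <- factor(test$g, levels = all_levels)

# Level "c" is now defined in the training set but has no rows there.
print(table(train$g))
```

Building the level set with union() keeps the training levels in their
original order, so the reference level of the factor does not change.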


Jim


On Thu, Apr 30, 2015 at 8:05 AM, thuksu <t...@huksu.com> wrote:
> My training set and my test set have some factor levels that are
> different....  It's rare, but it occurs.
>
> What is a good way for dealing with this?
>
> I don't want to throw away the entire row from the data frame, because there
> is some valuable information in there.
>
> Is there some way to say something like "use the weighted average
> coefficient level for this factor"?
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/GLM-What-is-a-good-way-for-dealing-with-new-factor-levels-in-the-test-set-tp4706621.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

