Dear Xiangrui,
Thanks for your reply. We will use sampling for now. However, just to let you
know, we believe it is not the best fit for our problem, for two reasons:
(1) the high dimensionality of the data (600 features) and (2) a highly
skewed distribution.

Do you have any idea when MLlib v1.2 will be released? That way we can plan
accordingly.
> Date: Tue, 2 Sep 2014 23:15:09 -0700
> Subject: Re: MLLib decision tree: Weights
> From: men...@gmail.com
> To: ssti...@live.com
> CC: user@spark.apache.org
> 
> This is not supported in MLlib. Hopefully, we will add support for
> weighted examples in v1.2. If you want to train weighted instances
> with the current tree implementation, please try importance sampling
> first to adjust the weights. For instance, an example with weight 0.3
> is sampled with probability 0.3, and if it is sampled, its weight
> becomes 1. -Xiangrui
> 
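
For the record, here is roughly how we plan to apply the importance-sampling
workaround you describe above. This is only a sketch: the function name and
the RDD[(Double, LabeledPoint)] input format are our own choices rather than
an MLlib API, and weights are assumed to lie in [0, 1].

import scala.util.Random

import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Turn weighted examples into an unweighted training set: keep each example
// with probability equal to its weight; a kept example then counts as weight 1.
def sampleByWeight(weighted: RDD[(Double, LabeledPoint)],
                   seed: Long): RDD[LabeledPoint] = {
  weighted.mapPartitionsWithIndex { (partition, iter) =>
    val rng = new Random(seed + partition)  // per-partition RNG for reproducibility
    iter.collect { case (w, point) if rng.nextDouble() < w => point }
  }
}

The resulting RDD[LabeledPoint] would then be fed to the existing decision
tree training API as usual.
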
> On Tue, Sep 2, 2014 at 1:05 PM, Sameer Tilak <ssti...@live.com> wrote:
> > Hi everyone,
> >
> >
> > We are looking to apply a weight to each training example; this weight
> > should be used when computing the penalty of a misclassified example.  For
> > instance, without weighting, each example is penalized 1 point when
> > evaluating the model of a classifier, such as a decision tree.  We would
> > like to customize this penalty for each training example, such that we could
> > apply a penalty of W for a misclassified example, where W is a weight
> > associated with the given training example.
> >
> >
> > Is this something that is supported directly in MLlib? I would appreciate it
> > if someone could point me in the right direction.
> 
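
And to be concrete about what we are ultimately after: below is the kind of
weighted misclassification penalty we have in mind, written as an evaluation
metric over (prediction, label, weight) triples. The names here are
illustrative only; this is not an existing MLlib API.

import org.apache.spark.SparkContext._  // implicit RDD[Double].sum() on Spark 1.x
import org.apache.spark.rdd.RDD

// A misclassified example is penalized by its weight W instead of a flat 1;
// the total penalty is normalized by the total weight.
def weightedError(results: RDD[(Double, Double, Double)]): Double = {
  val totalPenalty = results.map { case (prediction, label, w) =>
    if (prediction != label) w else 0.0
  }.sum()
  totalPenalty / results.map(_._3).sum()
}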