Hi DB,

Thank you!

In the general case (not only for regression), I think that the Regularizer should
be tightly coupled with the Gradient; otherwise it will have no idea which weights
are the bias (intercept).
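As a toy sketch of the point (plain Python with hypothetical names, not MLlib's
actual Scala API): a standalone regularizer can only skip the intercept if the
bias positions are handed to it from outside, and that layout information lives
with the Gradient.

```python
def l2_gradient_skipping_bias(weights, reg_param, bias_indices):
    # Hypothetical helper: without being told which positions are
    # bias/intercept terms (bias_indices), a standalone regularizer
    # would have to penalize every weight uniformly. The Gradient is
    # the component that knows the model layout.
    return [0.0 if i in bias_indices else reg_param * w
            for i, w in enumerate(weights)]
```

For example, with the last weight as the intercept, its regularization gradient
comes out zero while the other components get the usual regParam * w_i.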

Best regards, Alexander

-----Original Message-----
From: DB Tsai [mailto:dbt...@dbtsai.com] 
Sent: Tuesday, April 07, 2015 3:28 PM
To: Ulanov, Alexander
Cc: dev@spark.apache.org
Subject: Re: Regularization in MLlib

1) Norm(weights, N) returns (w_1^N + w_2^N + ...)^(1/N), so norm * norm is
required to recover the sum of squares.
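A small numeric sketch of this point (plain Python, not MLlib code): the 2-norm
is the square root of the sum of squares, so it must be squared in the reported
loss, and the 0.5 factor is there precisely so the gradient comes out as
regParam * w.

```python
import math

def l2_loss(weights, reg_param):
    # Norm(weights, 2.0) = sqrt(w_1^2 + w_2^2 + ...), so squaring it
    # (norm * norm) recovers the sum of squares in the penalty term.
    norm = math.sqrt(sum(w * w for w in weights))
    return 0.5 * reg_param * norm * norm

def l2_gradient(weights, reg_param):
    # d/dw_i [0.5 * regParam * sum(w^2)] = regParam * w_i;
    # the 0.5 exists exactly so it cancels under differentiation.
    return [reg_param * w for w in weights]
```

With weights (3, 4) the norm is 5, so the loss is 0.5 * regParam * 25, not
0.5 * regParam * 5.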

2) This is a bug, as you said. I intended to fix it using weighted
regularization, with the intercept term regularized with weight zero
(https://github.com/apache/spark/pull/1518), but I never actually had time to
finish it. In the meantime, I'm fixing this without that framework in the new ML
pipeline framework.
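A minimal sketch of the weighted-regularization idea (hypothetical Python, not
the code in the PR): each coefficient gets its own regularization weight, and
setting the intercept's weight to zero exempts it from the penalty.

```python
def weighted_l2(weights, reg_param, reg_weights):
    # Hypothetical sketch: reg_weights has one entry per coefficient.
    # Setting the intercept's entry to 0.0 removes it from both the
    # penalty and its gradient, without special-casing it elsewhere.
    loss = 0.5 * reg_param * sum(rw * w * w
                                 for rw, w in zip(reg_weights, weights))
    grad = [reg_param * rw * w for rw, w in zip(reg_weights, weights)]
    return loss, grad
```

For example, with the last coefficient as the intercept and reg_weights of
(1, 1, 0), the intercept contributes nothing to either the loss or the gradient.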

3) I think in the long term we need a weighted regularizer instead of the
updater, which couples regularization with the adaptive step-size update for GD;
that coupling is not needed in other optimization packages.
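For illustration, a rough Python sketch of that coupling (loosely modeled on the
behavior of an L2 updater, not MLlib's actual code): a single update step both
applies the GD step-size schedule and folds in the L2 term, so an optimizer such
as L-BFGS, which manages its own step sizes, cannot reuse the regularization
part cleanly.

```python
import math

def sgd_l2_updater_step(weights, gradient, step_size, iteration, reg_param):
    # Sketch of the coupling: the adaptive step size
    # (step_size / sqrt(iteration)) and the L2 shrinkage are applied
    # in one place. A weighted regularizer would expose only the
    # regularized loss/gradient and leave stepping to the optimizer.
    this_step = step_size / math.sqrt(iteration)
    return [w - this_step * (g + reg_param * w)
            for w, g in zip(weights, gradient)]
```

With reg_param = 0 this reduces to a plain GD step, which shows the two concerns
are independent and only bundled together by the interface.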

Sincerely,

DB Tsai
-------------------------------------------------------
Blog: https://www.dbtsai.com


On Tue, Apr 7, 2015 at 3:03 PM, Ulanov, Alexander <alexander.ula...@hp.com> 
wrote:
> Hi,
>
> Could anyone elaborate on the regularization in Spark? I've found that L1 and 
> L2 are implemented with Updaters (L1Updater, SquaredL2Updater).
> 1) Why is the loss reported by L2 (0.5 * regParam * norm * norm), where norm is
> Norm(weights, 2.0)? It should be 0.5 * regParam * norm (the 0.5 disappearing
> after differentiation). It seems that it is mixed up with mean squared error.
> 2) Why are all weights regularized? I think we should leave the bias weights
> (a.k.a. free or intercept) untouched if we don't assume that the data is
> centered.
> 3) Are there any short-term plans to move regularization from the updater to a
> more convenient place?
>
> Best regards, Alexander
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org