1) Norm(weights, N) returns (|w_1|^N + |w_2|^N + ...)^(1/N), i.e. it already
takes the N-th root, so norm * norm is required to recover the sum of squares
that the L2 penalty needs.
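
To make the arithmetic concrete, here is a minimal sketch in plain Scala (the
pNorm and l2Penalty helpers are illustrative names of mine, not the actual
SquaredL2Updater code) showing that squaring the 2-norm gives back the sum of
squares:

object L2PenaltySketch {
  // p-norm: (|w_1|^p + |w_2|^p + ...)^(1/p)
  def pNorm(weights: Array[Double], p: Double): Double =
    math.pow(weights.map(w => math.pow(math.abs(w), p)).sum, 1.0 / p)

  def l2Penalty(weights: Array[Double], regParam: Double): Double = {
    val norm = pNorm(weights, 2.0)
    0.5 * regParam * norm * norm   // equals 0.5 * regParam * sum(w_i^2)
  }

  def main(args: Array[String]): Unit = {
    val w = Array(3.0, 4.0)
    println(l2Penalty(w, 0.1))     // norm = 5.0, so 0.5 * 0.1 * 25.0 = 1.25
  }
}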

2) This is a bug, as you said. I intended to fix it using weighted
regularization, where the intercept term is regularized with weight
zero: https://github.com/apache/spark/pull/1518 But I never actually
had time to finish it. In the meantime, I'm fixing this outside that
framework, in the new ML pipeline framework.
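
Roughly, the idea is per-coefficient regularization weights, with the
intercept's weight set to zero. A hypothetical sketch (the regWeights /
weightedL2Gradient names are mine, not the API in that PR):

object WeightedL2Sketch {
  // regWeights(i) == 0.0 exempts coefficient i (e.g. the intercept) from the penalty.
  def weightedL2Gradient(weights: Array[Double],
                         regWeights: Array[Double],
                         regParam: Double): Array[Double] =
    weights.zip(regWeights).map { case (w, rw) => regParam * rw * w }

  def main(args: Array[String]): Unit = {
    val weights    = Array(0.7, -1.2, 2.5)   // last entry is the intercept
    val regWeights = Array(1.0,  1.0, 0.0)   // intercept regularized with weight zero
    println(weightedL2Gradient(weights, regWeights, 0.1).mkString(", "))
    // ~0.07, -0.12, 0.0 -- the intercept contributes nothing to the penalty gradient
  }
}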

3) I think in the long term we need a weighted regularizer instead of
an updater, which couples regularization with the adaptive step-size
update for GD and is not needed by other optimization packages.
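
Something along these lines, where the regularizer is a standalone component
that only knows about the penalty and its gradient, and step-size logic stays
in the optimizer. The trait and class names here are hypothetical, not an
existing Spark interface:

// Standalone regularizer, decoupled from any step-size/update logic.
trait Regularizer extends Serializable {
  /** Returns the penalty value and accumulates its gradient into cumGradient. */
  def compute(weights: Array[Double], cumGradient: Array[Double]): Double
}

class WeightedSquaredL2(regParam: Double, regWeights: Array[Double]) extends Regularizer {
  override def compute(weights: Array[Double], cumGradient: Array[Double]): Double = {
    var penalty = 0.0
    var i = 0
    while (i < weights.length) {
      penalty += 0.5 * regParam * regWeights(i) * weights(i) * weights(i)
      cumGradient(i) += regParam * regWeights(i) * weights(i)
      i += 1
    }
    penalty
  }
}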

Sincerely,

DB Tsai
-------------------------------------------------------
Blog: https://www.dbtsai.com


On Tue, Apr 7, 2015 at 3:03 PM, Ulanov, Alexander
<alexander.ula...@hp.com> wrote:
> Hi,
>
> Could anyone elaborate on the regularization in Spark? I've found that L1 and 
> L2 are implemented with Updaters (L1Updater, SquaredL2Updater).
> 1) Why is the loss reported by L2 equal to (0.5 * regParam * norm * norm), where norm is
> Norm(weights, 2.0)? It should be 0.5 * regParam * norm (the 0.5 disappears after
> differentiation). It seems that it is mixed up with mean squared error.
> 2) Why are all weights regularized? I think we should leave the bias weights
> (aka free or intercept) untouched if we don't assume that the data is
> centered.
> 3) Are there any short-term plans to move regularization from the updater to a
> more convenient place?
>
> Best regards, Alexander
>
