[ https://issues.apache.org/jira/browse/SPARK-26173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-26173:
----------------------------------
    Affects Version/s:     (was: 2.4.0)
                       3.0.0

> Prior regularization for Logistic Regression
> --------------------------------------------
>
>                 Key: SPARK-26173
>                 URL: https://issues.apache.org/jira/browse/SPARK-26173
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>    Affects Versions: 3.0.0
>            Reporter: Facundo Bellosi
>            Priority: Minor
>         Attachments: Prior regularization.png
>
>
> This feature enables Maximum A Posteriori (MAP) optimization for Logistic
> Regression based on a Gaussian prior. In practice, this is just implementing
> a more general form of L2 regularization parameterized by a (multivariate)
> mean and precisions (inverse of variance) vectors.
> Prior regularization is calculated through the following formula:
> !Prior regularization.png!
> where:
> * λ: regularization parameter ({{regParam}})
> * K: number of coefficients (weights vector length)
> * w~i~ with prior Normal(μ~i~, β~i~^2^)
> _Reference: Bishop, Christopher M. (2006). Pattern Recognition and Machine
> Learning (section 4.5). Berlin, Heidelberg: Springer-Verlag._
> h3. Existing implementations
> * Python: [bayes_logistic|https://pypi.org/project/bayes_logistic/]
> h2. Implementation
> * 2 new parameters added to {{LogisticRegression}}: {{priorMean}} and
> {{priorPrecisions}}.
> * 1 new class ({{PriorRegularization}}) implements the calculations of the
> value and gradient of the prior regularization term.
> * Prior regularization is enabled when both vectors are provided and
> {{regParam}} > 0 and {{elasticNetParam}} < 1.
> h2. Tests
> * {{DifferentiableRegularizationSuite}}
> ** {{Prior regularization}}
> * {{LogisticRegressionSuite}}
> ** {{prior precisions should be required when prior mean is set}}
> ** {{prior mean should be required when prior precisions is set}}
> ** {{`regParam` should be positive when using prior regularization}}
> ** {{`elasticNetParam` should be less than 1.0 when using prior
> regularization}}
> ** {{prior mean and precisions should have equal length}}
> ** {{priors' length should match number of features}}
> ** {{binary logistic regression with prior regularization equivalent to L2}}
> ** {{binary logistic regression with prior regularization equivalent to L2
> (bis)}}
> ** {{binary logistic regression with prior regularization}}
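
The value and gradient of the prior regularization term, and the "equivalent to L2" tests listed above, can be illustrated with a small sketch. The attached formula image is not reproduced in this message, so the form used here, (λ/2) Σ p_i (w_i − μ_i)² with p_i the prior precision of coefficient i, is an assumption; it is chosen so that a zero mean and unit precisions reduce to the standard L2 penalty (λ/2)‖w‖². The function names `prior_regularization` and `l2_regularization` are hypothetical, and this is not the `PriorRegularization` class from the patch.

```python
from typing import List, Sequence, Tuple


def prior_regularization(
    weights: Sequence[float],
    prior_mean: Sequence[float],
    prior_precisions: Sequence[float],
    reg_param: float,
) -> Tuple[float, List[float]]:
    """Return (value, gradient) of an assumed Gaussian-prior penalty.

    Assumed form (not taken from the attached image):
        value = (reg_param / 2) * sum_i p_i * (w_i - mu_i)^2
        grad_i = reg_param * p_i * (w_i - mu_i)
    """
    # Mirrors two of the checks named in the test list: equal lengths
    # and a positive regularization parameter.
    if not (len(weights) == len(prior_mean) == len(prior_precisions)):
        raise ValueError("weights, prior mean and prior precisions must have equal length")
    if reg_param <= 0.0:
        raise ValueError("reg_param must be positive")

    value = 0.0
    gradient: List[float] = []
    for w, mu, p in zip(weights, prior_mean, prior_precisions):
        diff = w - mu
        value += 0.5 * reg_param * p * diff * diff
        gradient.append(reg_param * p * diff)
    return value, gradient


def l2_regularization(
    weights: Sequence[float], reg_param: float
) -> Tuple[float, List[float]]:
    """Standard L2 penalty (reg_param / 2) * ||w||^2 and its gradient."""
    value = 0.5 * reg_param * sum(w * w for w in weights)
    gradient = [reg_param * w for w in weights]
    return value, gradient


if __name__ == "__main__":
    w = [0.5, -1.0, 2.0]
    # Zero mean and unit precisions: the prior term should match plain L2.
    print(prior_regularization(w, [0.0] * 3, [1.0] * 3, reg_param=0.1))
    print(l2_regularization(w, reg_param=0.1))
```

Under this assumed form, a larger precision pulls the corresponding coefficient more strongly toward its prior mean, while a precision of zero leaves that coefficient unregularized.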