Janardhan created SYSTEMML-2018:
-----------------------------------

             Summary: Fixing weight decay regularization in Adam
                 Key: SYSTEMML-2018
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2018
             Project: SystemML
          Issue Type: Improvement
          Components: Algorithms
            Reporter: Janardhan
Common implementations of adaptive gradient algorithms such as Adam limit the potential benefit of weight decay regularization: the weights do not decay multiplicatively (as would be expected for standard weight decay) but only through an additive penalty term that is folded into the gradient and then rescaled by the adaptive update. The following paper fixes regularization in Adam by decoupling weight decay from the gradient step, i.e., adding a single decay term (lambda * w) directly to the weight update rather than to the gradient: https://arxiv.org/pdf/1711.05101.pdf
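A minimal NumPy sketch of the difference, for illustration only (this is not SystemML DML or the paper's full algorithm with its learning-rate schedule; function names and hyperparameter defaults are assumptions):

{code:python}
import numpy as np

def adam_l2_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999,
                 eps=1e-8, lam=0.01):
    """Coupled variant: L2 penalty added to the gradient, so the decay
    is rescaled by the adaptive denominator sqrt(v_hat) + eps."""
    g = grad + lam * w                      # penalty folded into gradient
    m = beta1 * m + (1 - beta1) * g         # first-moment estimate
    v = beta2 * v + (1 - beta2) * g ** 2    # second-moment estimate
    m_hat = m / (1 - beta1 ** t)            # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

def adamw_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999,
               eps=1e-8, lam=0.01):
    """Decoupled variant (arXiv:1711.05101): decay applied directly to
    the weights in the update step, outside the adaptive rescaling."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * (m_hat / (np.sqrt(v_hat) + eps) + lam * w)  # + decay term
    return w, m, v
{code}

In the coupled variant the decay term is divided by sqrt(v_hat), so parameters with large historical gradients are regularized less; decoupling makes the decay uniform and multiplicative, as standard weight decay intends.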