Hi, I looked into the LinearSVC flow and found the gradient for hinge as follows:
Our loss function with {0, 1} labels is max(0, 1 - (2y - 1) f_w(x)). Therefore the gradient is -(2y - 1) x when 1 - (2y - 1) f_w(x) > 0, and 0 otherwise.

max is a non-smooth function. Did we try using a ReLU/Softmax function to smooth the hinge loss? The loss function would change to SoftMax(0, 1 - (2y - 1) f_w(x)). Since this function is smooth, the gradient will be well defined and LBFGS/OWLQN should behave well.

Please let me know if this has been tried already. If not, I can run some benchmarks. We already have softmax in multinomial regression, and it can be reused for the LinearSVC flow.

Thanks.
Deb
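
P.S. A rough sketch of the smoothing idea in plain Python, to make the proposal concrete. It assumes that SoftMax(0, z) means the log-sum-exp soft maximum log(1 + exp(z)) (often called softplus) and that f_w(x) is the linear score w.x. The temperature t and all function names are hypothetical, chosen only for illustration; this is not the existing LinearSVC code.

```python
import math

def hinge(y, score):
    """Plain hinge loss for a {0, 1} label y and raw score f_w(x)."""
    return max(0.0, 1.0 - (2 * y - 1) * score)

def softplus(z):
    """Numerically stable log(1 + exp(z)), the soft maximum of 0 and z."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def sigmoid(z):
    """Numerically stable logistic function, the derivative of softplus."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def smooth_hinge_and_grad(w, x, y, t=1.0):
    """Smoothed hinge t * softplus(z / t) with z = 1 - (2y - 1) * w.x.

    Returns the loss and its gradient with respect to w.
    t is an assumed temperature: smaller t tracks the hinge more closely.
    """
    s = 2 * y - 1                      # map {0, 1} labels to {-1, +1}
    z = 1.0 - s * sum(wi * xi for wi, xi in zip(w, x))
    loss = t * softplus(z / t)
    coeff = -s * sigmoid(z / t)        # chain rule: d(loss)/dz * dz/d(w.x)
    grad = [coeff * xi for xi in x]
    return loss, grad
```

The gradient is -(2y - 1) x scaled by sigmoid(z / t), so it moves continuously between 0 and the hinge gradient instead of jumping at the kink. The smoothed loss always sits above the hinge, by at most t * log(2).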