That appears to work (with a small modification):
param_grid = [{'eta0': [1/alpha_this_step], 'alpha': [alpha_this_step]}
              for alpha_this_step in 10.0**(np.arange(11) - 5)]
Neat trick. Thanks.
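For anyone who wants to try this, here is a self-contained sketch of how that coupled grid could be wired up end to end. The synthetic data, the hinge/L2 settings, and the sklearn.model_selection import path (the thread predates it and used the older grid_search module) are my assumptions, not anything stated in the messages:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=1000, random_state=0)

# One single-point "grid" per alpha, with eta0 tied to 1/alpha, so that
# invscaling with power_t=1 gives eta_t = eta0 / t = 1 / (alpha * t).
param_grid = [{'eta0': [1/a], 'alpha': [a]}
              for a in 10.0**(np.arange(11) - 5)]

search = GridSearchCV(
    SGDClassifier(loss='hinge', penalty='l2',
                  learning_rate='invscaling', power_t=1.0),
    param_grid)
search.fit(X, y)
print(search.best_params_)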
Unfortunately I don't have time to do the experiment, but eta = 1/t and eta
= 1/(alpha*t) seem to be pretty
You can actually do that using the current grid-search.
Specify the "grid" as a list of single grid-points. That should do.
param_grid = [{'eta0': 1/alpha_this_step, 'alpha': alpha_this_step}
              for alpha_this_step in my_alphas]
That should do it, right?
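For anyone following along, this is roughly how the grid machinery treats a list of dicts: each dict is expanded independently, so a list of single-point dicts enumerates exactly the coupled (eta0, alpha) settings rather than a full cross-product. A small illustration (the list-valued entries, the placeholder my_alphas values, and the sklearn.model_selection path are my assumptions; the values must be wrapped in lists, which is the "small modification" mentioned above):

from sklearn.model_selection import ParameterGrid

my_alphas = [0.001, 0.01, 0.1]   # hypothetical values for illustration
param_grid = [{'eta0': [1/a], 'alpha': [a]} for a in my_alphas]

# Each dict is a one-point grid, so the expansion yields only the
# coupled pairs we asked for.
for point in ParameterGrid(param_grid):
    print(point)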
I think for the "optimum" the same guarantee
Just noticed I got my Greeks wrong. By nu I meant eta everywhere.
On Thu, Sep 11, 2014 at 2:24 PM, F. William High wrote:
> The Shalev-Shwartz et al. Pegasos update rule on the learning rate
> parameter is
>
> nu_i = 1 / (lambda * i)
>
> where lambda multiplies the regularization term.
The Shalev-Shwartz et al. Pegasos update rule on the learning rate
parameter is
nu_i = 1 / (lambda * i)
where lambda multiplies the regularization term. If this rule is used,
they show you can converge to an error of epsilon in O(1/(lambda*epsilon))
iterations with high probability. This differs
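To make the schedule concrete, here is a rough NumPy sketch of hinge-loss SGD with the Pegasos step size eta_i = 1/(lambda*i). This is my own simplification of the full Pegasos algorithm (single-example subgradient steps, no projection step), just to show where the schedule enters:

import numpy as np

def pegasos_sgd(X, y, lam, n_iter=1000, seed=0):
    # y must be in {-1, +1}; X is (n_samples, n_features).
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for i in range(1, n_iter + 1):
        eta = 1.0 / (lam * i)          # the 1/(lambda*i) schedule
        j = rng.integers(n)
        margin = y[j] * X[j].dot(w)
        # Subgradient step on (lam/2)*||w||^2 + hinge loss at example j.
        if margin < 1:
            w = (1 - eta * lam) * w + eta * y[j] * X[j]
        else:
            w = (1 - eta * lam) * w
    return w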