The proper step size partially depends on the Lipschitz constant of
the objective's gradient: for gradient descent, a constant step of 1/L
is safe when the gradient is L-Lipschitz, but L is rarely known in
practice. So you should let the machine try different parameter
combinations and select the best. We are working with people from
AMPLab to make hyperparameter tuning easier in MLlib 1.2. For the
theory, Nesterov's book "Introductory Lectures on Convex Optimization"
is a good reference.
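
To illustrate, here is a minimal grid-search sketch against the MLlib
1.x RDD API. The training and validation RDD[LabeledPoint] values are
assumptions (build them from your own feature sets), and
RidgeRegressionWithSGD stands in for LinearRegressionWithSGD because
the latter applies no regularization:

import org.apache.spark.SparkContext._
import org.apache.spark.mllib.regression.{LabeledPoint, RidgeRegressionWithSGD}
import org.apache.spark.rdd.RDD

def tune(training: RDD[LabeledPoint],
         validation: RDD[LabeledPoint]): ((Double, Double), Double) = {
  val grid = for {
    stepSize <- Seq(0.001, 0.01, 0.1, 1.0)
    regParam <- Seq(0.0, 0.01, 0.1, 1.0)
  } yield (stepSize, regParam)

  val scored = grid.map { case (stepSize, regParam) =>
    val model =
      RidgeRegressionWithSGD.train(training, 200, stepSize, regParam)
    // Mean squared error on the held-out set.
    val mse = validation
      .map(p => math.pow(model.predict(p.features) - p.label, 2))
      .mean()
    ((stepSize, regParam), mse)
  }
  scored.minBy(_._2) // (stepSize, regParam) with the lowest validation MSE
}

Each configuration trains on the cluster, so the loop itself can stay
sequential on the driver.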

We didn't use line search in the current implementation of
LinearRegression; we should definitely add that option in the
future.
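
For reference, a backtracking (Armijo) line search looks roughly like
the toy sketch below. It is not MLlib code, and all names in it are
hypothetical; it only shows how the step size would shrink itself
until the objective decreases sufficiently:

def backtrack(f: Array[Double] => Double,
              x: Array[Double],
              grad: Array[Double],
              init: Double = 1.0,
              shrink: Double = 0.5,
              c: Double = 1e-4): Double = {
  val fx = f(x)
  val gradNormSq = grad.map(g => g * g).sum
  var step = init
  // Current trial point x - step * grad, recomputed as step shrinks.
  def candidate = x.zip(grad).map { case (xi, gi) => xi - step * gi }
  // Armijo condition: f(x - step * grad) <= f(x) - c * step * ||grad||^2.
  while (f(candidate) > fx - c * step * gradNormSq) {
    step *= shrink
  }
  step
}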

Best,
Xiangrui

On Wed, Oct 8, 2014 at 7:21 AM, Sameer Tilak <ssti...@live.com> wrote:
> Hi Xiangrui,
> Changing the default step size to 0.01 made a huge difference. The results
> make sense when I use A + B + C + D. MSE is ~0.07 and the outcome matches
> the domain knowledge.
>
> I was wondering whether there is any documentation on the parameters and
> when/how to vary them.
>
>> Date: Tue, 7 Oct 2014 15:11:39 -0700
>> Subject: Re: MLLib Linear regression
>> From: men...@gmail.com
>> To: ssti...@live.com
>> CC: user@spark.apache.org
>
>>
>> Did you test different regularization parameters and step sizes? In
>> the combination that works, I don't see "A + D". Did you test that
>> combination? Is there any linear dependency between A's columns and
>> D's columns? -Xiangrui
>>
>> On Tue, Oct 7, 2014 at 1:56 PM, Sameer Tilak <ssti...@live.com> wrote:
>> > BTW, one detail:
>> >
>> > When the number of iterations is 100, all weights are zero or below
>> > and the indices are only from set A.
>> >
>> > When the number of iterations is 150, I see 30+ non-zero weights (when
>> > sorted by weight) and the indices are distributed across all sets.
>> > However, MSE is high (5.xxx) and the result does not match the domain
>> > knowledge.
>> >
>> > When the number of iterations is 400, I see 30+ non-zero weights (when
>> > sorted by weight) and the indices are distributed across all sets.
>> > However, MSE is high (6.xxx) and the result does not match the domain
>> > knowledge.
>> >
>> > Any help will be highly appreciated.
>> >
>> >
>> > ________________________________
>> > From: ssti...@live.com
>> > To: user@spark.apache.org
>> > Subject: MLLib Linear regression
>> > Date: Tue, 7 Oct 2014 13:41:03 -0700
>> >
>> >
>> > Hi All,
>> > I have following classes of features:
>> >
>> > class A: 15000 features
>> > class B: 170 features
>> > class C: 900 features
>> > class D: 6000 features
>> >
>> > I use linear regression (over sparse data). I get excellent results
>> > with low RMSE (~0.06) for the following combinations of classes:
>> > 1. A + B + C
>> > 2. B + C + D
>> > 3. A + B
>> > 4. A + C
>> > 5. B + D
>> > 6. C + D
>> > 7. D
>> >
>> > Unfortunately, when I use A + B + C + D (all the features) I get results
>> > that don't make any sense -- all weights are zero or below and the
>> > indices are only from set A. I also get high MSE. I changed the number
>> > of iterations from 100 to 150, 250, and even 400, and I still get a high
>> > MSE (~5-6). Are there any other parameters that I can play with? Any
>> > insight on what could be wrong? Is it that it somehow cannot scale up to
>> > 22K features? (I highly doubt that.)
>> >
>> >
>> >

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
