As Robin suggested, you may try the following new implementation.

https://github.com/apache/spark/commit/6a827d5d1ec520f129e42c3818fe7d0d870dcbef
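
For context on the iteration-count and step-size issue Robin raises in the quoted message below, here is a minimal sketch in plain Python. It is not Spark's implementation and not the code from the commit above. It assumes a made-up, noise-free dataset (y = 3 + 2x) and uses full-batch gradient descent rather than true SGD to keep the run deterministic. All function names are hypothetical.

```python
# Hypothetical illustration: closed-form least squares vs. batch gradient
# descent on one feature, with and without feature standardization.

def closed_form(xs, ys):
    """Ordinary least squares for y = b + w*x using the textbook formulas."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    w = sxy / sxx
    return my - w * mx, w  # (intercept, slope)

def gradient_descent(xs, ys, step, iters):
    """Full-batch gradient descent on half the mean squared error."""
    b, w = 0.0, 0.0
    n = len(xs)
    for _ in range(iters):
        errs = [b + w * x - y for x, y in zip(xs, ys)]
        grad_b = sum(errs) / n
        grad_w = sum(e * x for e, x in zip(errs, xs)) / n
        b -= step * grad_b
        w -= step * grad_w
    return b, w

def fit_standardized(xs, ys, step, iters):
    """Standardize x, run gradient descent, map coefficients back to raw x."""
    n = len(xs)
    mean = sum(xs) / n
    std = (sum((x - mean) ** 2 for x in xs) / n) ** 0.5
    zs = [(x - mean) / std for x in xs]
    b_z, w_z = gradient_descent(zs, ys, step, iters)
    w = w_z / std
    return b_z - w * mean, w

# Assumed toy data: x ranges from 100 to 1000, y = 3 + 2x exactly.
xs = [100.0 * i for i in range(1, 11)]
ys = [3.0 + 2.0 * x for x in xs]

print("closed form      :", closed_form(xs, ys))
# Raw features force a tiny step size to stay stable.
print("GD, raw features :", gradient_descent(xs, ys, step=1e-6, iters=1000))
print("GD, standardized :", fit_standardized(xs, ys, step=0.1, iters=200))
```

On this toy data the raw-feature run recovers the slope quickly, but its intercept is still far from 3 after 1000 iterations, because the only stable step size is tiny relative to the intercept direction. The standardized run gets close to the closed-form answer in 200 iterations. Real SGD adds sampling noise on top of this conditioning problem.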

Thanks.

Sincerely,

DB Tsai
----------------------------------------------------------
Blog: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D
<https://pgp.mit.edu/pks/lookup?search=0x59DF55B8AF08DF8D>

On Tue, Jun 9, 2015 at 3:22 PM, Robin East <robin.e...@xense.co.uk> wrote:

> Hi Stephen
>
> How many iterations is "a very large number"? SGD is notorious for
> requiring hundreds or thousands of iterations, and you may also need to
> spend some time tuning the step size. Spark 1.4 includes an implementation
> of ElasticNet linear regression that is supposed to compare favourably
> with an equivalent R implementation.
>
> > On 9 Jun 2015, at 22:05, Stephen Carman <scar...@coldlight.com> wrote:
> >
> > Hi User group,
> >
> > We are using Spark's Linear Regression with SGD as the optimization
> > technique, and we are getting very sub-optimal results.
> >
> > Can anyone shed some light on why this implementation seems to produce
> > such poor results compared with our own implementation?
> >
> > We are using a very small dataset, but we have to use a very large
> > number of iterations to achieve results similar to our implementation's.
> > We've tried normalizing the data, not normalizing the data, and tuning
> > every parameter. Our implementation is a closed-form solution, so we
> > should be guaranteed convergence, whereas the Spark one is not, which is
> > understandable, but why is it so far off?
> >
> > Has anyone experienced this?
> >
> > Steve Carman, M.S.
> > Artificial Intelligence Engineer
> > Coldlight-PTC
> > scar...@coldlight.com
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> > For additional commands, e-mail: user-h...@spark.apache.org
> >
>
>
