Re: MLLib - Thoughts about refactoring Updater for LBFGS?

David Hall Thu, 06 Mar 2014 16:27:11 -0800

On Thu, Mar 6, 2014 at 4:21 PM, DB Tsai <dbt...@alpinenow.com> wrote:


> Hi David,
>
> I can converge to the same result with your breeze LBFGS and Fortran
> implementations now. Probably, I made some mistakes when I tried
> breeze before. I apologize that I claimed it's not stable.
>
> See the test case in BreezeLBFGSSuite.scala
> https://github.com/AlpineNow/spark/tree/dbtsai-breezeLBFGS
>
> This trains multinomial logistic regression on the iris dataset,
> and both optimizers can train the models with 98% training accuracy.
>

Great to hear! There were some bugs in LBFGS about 6 months ago, so
depending on when you last tried it, it might indeed have been buggy.


>
> There are two issues with using Breeze in Spark:
>
> 1) When the gradientSum and lossSum are computed in a distributed way
> inside a custom DiffFunction that is passed into your optimizer,
> Spark complains that the LBFGS class is not serializable. In
> BreezeLBFGS.scala, I have to convert the RDD to an array to make it
> work locally. It should be easy to fix by just having LBFGS implement
> Serializable.
>

I'm not sure why Spark would need to serialize LBFGS. Shouldn't it live on
the controller (driver) node? Or is this a per-node thing?

But it's no problem to make it serializable.
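
One plausible explanation, which is an assumption and not something confirmed
in this thread, is closure capture: a function that reads a field of an
enclosing object drags the whole object along when it is serialized. The
sketch below uses only the Scala and Java standard libraries (no Spark, no
Breeze), and ToyOptimizer, capturingStep, detachedStep and ClosureCheck are
hypothetical names made up for illustration.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for an optimizer that does not implement Serializable.
class ToyOptimizer(val stepSize: Double) {
  // Reads the field `stepSize`, so the closure holds a reference to `this`.
  def capturingStep: Double => Double = g => -stepSize * g

  // Copies the field into a local val first, so only a Double is captured.
  def detachedStep: Double => Double = {
    val s = stepSize
    g => -s * g
  }
}

object ClosureCheck {
  // True if plain Java serialization accepts `obj` and everything it references.
  def isSerializable(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }
}
```

With Scala 2.12 or later, ClosureCheck.isSerializable should reject
capturingStep and accept detachedStep. Making the enclosing class
Serializable, as proposed above, is the other way out.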


>
> 2) Breeze computes redundant gradient and loss. See the following log
> from both Fortran and Breeze implementations.
>

Err, yeah. I should probably have LBFGS do this automatically, but there's
a CachedDiffFunction that gets rid of the redundant calculations.
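
To illustrate the idea only: the sketch below is not Breeze's
CachedDiffFunction, just a minimal stand-in written against the Scala
standard library. ToyDiffFunction, CountingQuadratic and LastPointCache are
hypothetical names; the wrapper remembers the most recent point and returns
the stored (loss, gradient) pair when the same point is requested again.

```scala
// Hypothetical minimal interface: returns (loss, gradient) at the point x.
trait ToyDiffFunction {
  def calculate(x: Vector[Double]): (Double, Vector[Double])
}

// f(x) = sum of x_i^2 with gradient 2x; counts how often it is really evaluated.
class CountingQuadratic extends ToyDiffFunction {
  var evaluations = 0
  def calculate(x: Vector[Double]): (Double, Vector[Double]) = {
    evaluations += 1
    (x.map(v => v * v).sum, x.map(v => 2.0 * v))
  }
}

// Caches only the last point, which is enough when an optimizer asks for the
// same point several times in a row.
class LastPointCache(inner: ToyDiffFunction) extends ToyDiffFunction {
  private var last: Option[(Vector[Double], (Double, Vector[Double]))] = None

  def calculate(x: Vector[Double]): (Double, Vector[Double]) = last match {
    case Some((cachedX, result)) if cachedX == x => result // cache hit
    case _ =>
      val result = inner.calculate(x) // cache miss: do the expensive work once
      last = Some((x, result))
      result
  }
}
```

Calling calculate three times at the same point through LastPointCache runs
the wrapped function once, which is the kind of saving that would matter when
each evaluation is a pass over the whole dataset.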

-- David


>
> Thanks.
>
> Fortran:
> Iteration -1: loss 1.3862943611198926, diff 1.0
> Iteration 0: loss 1.5846343143210866, diff 0.14307193024217352
> Iteration 1: loss 1.1242501524477688, diff 0.29053004039012126
> Iteration 2: loss 1.0930151243303563, diff 0.027782962952189336
> Iteration 3: loss 1.054036932835569, diff 0.03566113127440601
> Iteration 4: loss 0.9907956302751622, diff 0.05999907649459571
> Iteration 5: loss 0.9184205380342829, diff 0.07304737423337761
> Iteration 6: loss 0.8259870936519937, diff 0.10064381175132982
> Iteration 7: loss 0.6327447552109574, diff 0.23395293458364716
> Iteration 8: loss 0.5534101162436359, diff 0.1253815427665277
> Iteration 9: loss 0.4045020086612566, diff 0.26907321376758075
> Iteration 10: loss 0.3078824990823728, diff 0.23885980452569627
>
> Breeze:
> Iteration -1: loss 1.3862943611198926, diff 1.0
> Mar 6, 2014 3:59:11 PM com.github.fommil.netlib.BLAS <clinit>
> WARNING: Failed to load implementation from:
> com.github.fommil.netlib.NativeSystemBLAS
> Mar 6, 2014 3:59:11 PM com.github.fommil.netlib.BLAS <clinit>
> WARNING: Failed to load implementation from:
> com.github.fommil.netlib.NativeRefBLAS
> Iteration 0: loss 1.3862943611198926, diff 0.0
> Iteration 1: loss 1.5846343143210866, diff 0.14307193024217352
> Iteration 2: loss 1.1242501524477688, diff 0.29053004039012126
> Iteration 3: loss 1.1242501524477688, diff 0.0
> Iteration 4: loss 1.1242501524477688, diff 0.0
> Iteration 5: loss 1.0930151243303563, diff 0.027782962952189336
> Iteration 6: loss 1.0930151243303563, diff 0.0
> Iteration 7: loss 1.0930151243303563, diff 0.0
> Iteration 8: loss 1.054036932835569, diff 0.03566113127440601
> Iteration 9: loss 1.054036932835569, diff 0.0
> Iteration 10: loss 1.054036932835569, diff 0.0
> Iteration 11: loss 0.9907956302751622, diff 0.05999907649459571
> Iteration 12: loss 0.9907956302751622, diff 0.0
> Iteration 13: loss 0.9907956302751622, diff 0.0
> Iteration 14: loss 0.9184205380342829, diff 0.07304737423337761
> Iteration 15: loss 0.9184205380342829, diff 0.0
> Iteration 16: loss 0.9184205380342829, diff 0.0
> Iteration 17: loss 0.8259870936519939, diff 0.1006438117513297
> Iteration 18: loss 0.8259870936519939, diff 0.0
> Iteration 19: loss 0.8259870936519939, diff 0.0
> Iteration 20: loss 0.6327447552109576, diff 0.233952934583647
> Iteration 21: loss 0.6327447552109576, diff 0.0
> Iteration 22: loss 0.6327447552109576, diff 0.0
> Iteration 23: loss 0.5534101162436362, diff 0.12538154276652747
> Iteration 24: loss 0.5534101162436362, diff 0.0
> Iteration 25: loss 0.5534101162436362, diff 0.0
> Iteration 26: loss 0.40450200866125635, diff 0.2690732137675816
> Iteration 27: loss 0.40450200866125635, diff 0.0
> Iteration 28: loss 0.40450200866125635, diff 0.0
> Iteration 29: loss 0.30788249908237314, diff 0.23885980452569502
>
> Sincerely,
>
> DB Tsai
> Machine Learning Engineer
> Alpine Data Labs
> --------------------------------------
> Web: http://alpinenow.com/
>
>
> On Wed, Mar 5, 2014 at 2:00 PM, David Hall <d...@cs.berkeley.edu> wrote:
> > On Wed, Mar 5, 2014 at 1:57 PM, DB Tsai <dbt...@alpinenow.com> wrote:
> >
> >> Hi David,
> >>
> >> On Tue, Mar 4, 2014 at 8:13 PM, dlwh <david.lw.h...@gmail.com> wrote:
> >> > I'm happy to help fix any problems. I've verified at points that the
> >> > implementation gives the exact same sequence of iterates for a few
> >> > different functions (with a particular line search) as the c port of
> >> > lbfgs. So I'm a little surprised it fails where Fortran succeeds... but
> >> > only a little. This was fixed late last year.
> >> I'm working on a reproducible test case using breeze vs fortran
> >> implementation to show the problem I've run into. The test will be in
> >> one of the test cases in my Spark fork, is it okay for you to
> >> investigate the issue? Or do I need to make it as a standalone test?
> >>
> >
> >
> > Um, as long as it wouldn't be too hard to pull out.
> >
> >
> >>
> >> Will send you the test later today.
> >>
> >> Thanks.
> >>
> >> Sincerely,
> >>
> >> DB Tsai
> >> Machine Learning Engineer
> >> Alpine Data Labs
> >> --------------------------------------
> >> Web: http://alpinenow.com/
> >>
>
