RE: [MLlib] Performance issues when building GBM models

2015-02-09 Thread Christopher Thom

angrui Meng [mailto:men...@gmail.com] Sent: Tuesday, 10 February 2015 7:07 AM To: Christopher Thom Cc: user@spark.apache.org Subject: Re: [MLlib] Performance issues when building GBM models Could you check the Spark UI and see whether there are RDDs being kicked out during the computation? We cache the

Re: [MLlib] Performance issues when building GBM models

2015-02-09 Thread Xiangrui Meng

Could you check the Spark UI and see whether there are RDDs being kicked out during the computation? We cache the residual RDD after each iteration. If we don't have enough memory/disk, it gets recomputed, which results in something like `t(n) = t(n-1) + const`. We might cache the features multiple times
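
To make the `t(n) = t(n-1) + const` pattern concrete, here is a minimal toy cost model in plain Python. It is not Spark code and does not reflect MLlib internals; `iteration_costs`, `fit_cost`, and `residual_cost` are hypothetical names and numbers chosen only for illustration. It assumes, as a simplification, that when the previous residuals have been evicted, the whole chain of residual updates has to be replayed.

```python
# Toy cost model, NOT Spark code: each boosting iteration derives a new
# residual dataset from the previous one. All names and costs here are
# hypothetical and chosen only to illustrate the growth pattern.

def iteration_costs(num_trees, fit_cost=10.0, residual_cost=1.0, cached=True):
    """Return the simulated cost of building each tree.

    fit_cost      -- assumed cost of fitting one tree once residuals are ready
    residual_cost -- assumed cost of deriving one residual set from the previous one
    cached        -- True if the previous residuals are still held in the cache
    """
    costs = []
    for n in range(1, num_trees + 1):
        # Cache hit: only the newest residual step is computed.
        # Eviction (simplified): all n residual steps are replayed.
        steps = 1 if cached else n
        costs.append(fit_cost + steps * residual_cost)
    return costs


cached = iteration_costs(5, cached=True)
evicted = iteration_costs(5, cached=False)

print(cached)   # [11.0, 11.0, 11.0, 11.0, 11.0]
print(evicted)  # [11.0, 12.0, 13.0, 14.0, 15.0]
```

In the evicted case each tree costs a fixed `residual_cost` more than the one before it, which matches the constant per-tree slowdown described in the thread, and the total time over all trees then grows roughly quadratically with the number of trees.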

[MLlib] Performance issues when building GBM models

2015-02-08 Thread Christopher Thom

Hi All, I wonder if anyone else has experience building a Gradient Boosted Trees model using Spark MLlib. I have noticed that when building decent-size models, the process slows down over time. We observe that the time to build tree n is approximately a constant time longer than the time to
