Re: [Scikit-learn-general] memory use of sklearn GBM implementation

2015-10-09 Thread Peter Rickwood
OK, I will write a minimal program to reproduce the issue and submit it as a bug report. I won't be able to do this for a couple of days as I'm AFK, but I will do it then. Cheers, Peter Date: Fri, 9 Oct 2015 10:39:10 -0400 > From: Andreas Mueller > > On 10/08/2015 08:40 PM, Peter

Re: [Scikit-learn-general] memory use of sklearn GBM implementation

2015-10-08 Thread Peter Rickwood
Found the issue: it is caused by warm start. I was using warm start to gradually add models to the GBM, and this causes a memory blowout. If I instead run the same number of iterations in one go rather than incrementally, I get no memory issue. Peter -
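
For readers unfamiliar with the two fitting patterns contrasted above, here is a minimal sketch. It is not the poster's program: the synthetic data, the tree counts, and the step size are assumptions chosen only to keep the example small, and it merely shows the two ways of reaching the same ensemble size, without measuring memory.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Small synthetic data set (an assumption for illustration only).
rng = np.random.RandomState(0)
X = rng.rand(200, 5)
y = X[:, 0] + 0.1 * rng.randn(200)

# Pattern 1: incremental fitting with warm_start, adding 20 trees per call.
incremental = GradientBoostingRegressor(
    n_estimators=20, warm_start=True, random_state=0
)
incremental.fit(X, y)
for n_trees in (40, 60):
    # Raising n_estimators and refitting keeps the existing trees
    # and only trains the additional ones.
    incremental.set_params(n_estimators=n_trees)
    incremental.fit(X, y)

# Pattern 2: the same number of boosting iterations in a single fit call.
one_shot = GradientBoostingRegressor(n_estimators=60, random_state=0)
one_shot.fit(X, y)

print(incremental.estimators_.shape[0], one_shot.estimators_.shape[0])
```

Both models end up with 60 boosting stages; the thread reports that only the incremental pattern triggered the memory growth in the poster's setup.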

Re: [Scikit-learn-general] memory use of sklearn GBM implementation

2015-10-08 Thread Peter Rickwood
Jacob: Great, thanks for confirming. Glad I'm not going crazy or doing something silly. Which sklearn version would I need to downgrade to in order to get back to the old setup (one splitter for all trees)? Andreas: Yes, it completes just fine if I set the number of iterations low enough (i.e. ~80). Thanks

Re: [Scikit-learn-general] memory use of sklearn GBM implementation

2015-10-08 Thread Peter Rickwood
So if you use less than 100 trees it runs through? > > Andy > > > On 10/08/2015 06:12 PM, Peter Rickwood wrote: > > > > > > Hello all, > > > > I'm puzzled by the memory use of sklearn's GBM implementation. It takes > > up all available me

Re: [Scikit-learn-general] Scikit-learn-general Digest, Vol 69, Issue 10

2015-10-08 Thread Peter Rickwood
ofiler > > So if you use less than 100 trees it runs through? > > Andy > > > On 10/08/2015 06:12 PM, Peter Rickwood wrote: > > > > > > Hello all, > > > > I'm puzzled by the memory use of sklearn's GBM implementation. It takes > > up all av

[Scikit-learn-general] memory use of sklearn GBM implementation

2015-10-08 Thread Peter Rickwood
Hello all, I'm puzzled by the memory use of sklearn's GBM implementation. It takes up all available memory and is forced to terminate by the OS, and I can't think of why it is using as much memory as it does. Here is the situation: I have a modest data set of size ~ 4GB (1800 columns, 55 rows,

[Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-23 Thread Peter Rickwood
Hello sklearn developers, I'd like the GBM implementation in sklearn to support Poisson loss, and I'm comfortable writing the code (I have modified my local sklearn source already and am using Poisson loss GBMs). The sklearn site says to get in touch via this list before making a contribution
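
As background on what a Poisson loss for gradient boosting involves, here is a minimal sketch in plain Python. It is not the poster's patch and not sklearn's API: the function names are hypothetical, and it assumes a log link, so the raw model output f is the log of the predicted mean.

```python
import math

def poisson_loss(y, raw):
    """Mean Poisson negative log-likelihood, dropping the log(y!) constant.

    Assumes a log link: the predicted mean is exp(raw).
    """
    return sum(math.exp(f) - yi * f for yi, f in zip(y, raw)) / len(y)

def negative_gradient(y, raw):
    """Pseudo-residuals that the next tree would be fit to: y - exp(raw)."""
    return [yi - math.exp(f) for yi, f in zip(y, raw)]

def initial_raw_prediction(y):
    """Constant raw prediction that minimises the loss: log of the mean count."""
    return math.log(sum(y) / len(y))

# Toy counts (an assumption for illustration only).
y = [0, 1, 3, 2, 4]
f0 = initial_raw_prediction(y)
raw = [f0] * len(y)
residuals = negative_gradient(y, raw)
print(f0, poisson_loss(y, raw), residuals)
```

At the initial constant prediction the pseudo-residuals sum to zero, which is a quick sanity check that the gradient and the starting value are consistent with each other.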