OK, I will write a minimal program to reproduce the issue and submit it as a
bug report. I won't be able to do this for a couple of days as I'm AFK, but I
will do it then.
Cheers,
Peter
-
Found the issue: it is because I am using warm start. I was gradually adding
models to the GBM with warm start, and this causes a memory blowout. If I
instead run the same number of iterations in one go rather than
incrementally, I get no memory issue.
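For reference, a minimal sketch of the two patterns I mean (placeholder data
shapes, not my real set; assumes GradientBoostingRegressor from
sklearn.ensemble):

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

X = np.random.rand(1000, 20)  # placeholder data, not my real set
y = np.random.rand(1000)

# Incremental pattern that blows up memory for me:
# grow the ensemble 10 trees at a time with warm_start.
gbm = GradientBoostingRegressor(n_estimators=10, warm_start=True)
gbm.fit(X, y)
for n in range(20, 110, 10):
    gbm.set_params(n_estimators=n)
    gbm.fit(X, y)  # keeps the existing trees and adds 10 more

# Same 100 trees fitted in one go: no memory issue.
gbm_one_shot = GradientBoostingRegressor(n_estimators=100)
gbm_one_shot.fit(X, y)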
Peter
-
Jacob: Great, thanks for confirming. Glad I'm not going crazy or doing
something silly.
What sklearn version would I need to downgrade to in order to get back to the
old setup (one splitter for all trees)?
Andreas: Yes, it completes just fine if I set the number of iterations low
enough (e.g. ~80).
Thanks
> So if you use less than 100 trees it runs through?
>
> Andy
>
>
> On 10/08/2015 06:12 PM, Peter Rickwood wrote:
> >
> >
> > Hello all,
> >
> > I'm puzzled by the memory use of sklearn's GBM implementation. It takes
> > up all available memory and is forced to terminate by the OS.
Hello all,
I'm puzzled by the memory use of sklearn's GBM implementation. It takes up
all available memory and is forced to terminate by the OS, and I can't think
of why it is using as much memory as it does.
Here is the situation:
I have a modest data set of size ~ 4GB (1800 columns, 55 rows,
-
Hello sklearn developers,
I'd like the GBM implementation in sklearn to support Poisson loss, and I'm
comfortable writing the code (I have modified my local sklearn source
already and am using Poisson-loss GBMs).
The sklearn site says to get in touch via this list before making a
contribution.
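To give a concrete idea of what I have in mind, here is a sketch of the loss
(not my actual patch; it assumes a log-link, so the raw prediction is the log
of the Poisson mean, and follows the usual negative-gradient convention in
gradient boosting):

import numpy as np

def poisson_deviance(y, raw_pred):
    # Poisson negative log-likelihood, up to a term constant in raw_pred,
    # with the mean modelled as exp(raw_pred)
    return np.mean(np.exp(raw_pred) - y * raw_pred)

def poisson_negative_gradient(y, raw_pred):
    # Pseudo-residuals each new tree would be fitted to: y - exp(raw_pred)
    return y - np.exp(raw_pred)

The log-link keeps the predicted mean positive, which is the standard choice
for Poisson regression.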