SGD and memory

2012-01-03 Thread Grant Ingersoll
I'm trying to run the full ASF email SGD classifier problem and am facing heap size issues. My current setup has 105 features and I am using a cardinality of 100K. I'm using the AdaptiveLogisticRegression. I'm getting heap errors and they occur when trying to construct the ALR class (i.e. not

Re: SGD and memory

2012-01-03 Thread Ted Dunning
Your math is correct. When you say you have 105 features, what do you mean? Are these textual features? Or what?
On Tue, Jan 3, 2012 at 2:53 PM, Grant Ingersoll wrote:
> I'm trying to run the full ASF email SGD classifier problem and am facing
> heap size issues. My current setup has 105 feat

Re: SGD and memory

2012-01-03 Thread Lance Norskog
Do these algorithms have good locality? For doing giant online computations it might be worth storing these in memory-mapped files. Or, give up and get the M/R SGD code in.
On Tue, Jan 3, 2012 at 2:59 PM, Ted Dunning wrote:
> Your math is correct.
>
> When you say you have 105 features, what do

Re: SGD and memory

2012-01-03 Thread Ted Dunning
No. They don't have particularly good locality. They would have moderate hotspots, but these would be scattered all over. The hotspots might allow L2 cache to help, but would not allow disk-based data to work. The major opportunity for improvement here is to incorporate some of the advances that V

Re: SGD and memory

2012-01-03 Thread Grant Ingersoll
On Jan 3, 2012, at 5:59 PM, Ted Dunning wrote:
> Your math is correct.
>
> When you say you have 105 features, what do you mean?
Sorry, that should have been 105 categories/labels. I'm trying to do the ASF email equivalent of 20 newsgroups, but in this case it's 105 ASF projects. The basic

Re: SGD and memory

2012-01-03 Thread Ted Dunning
Ahh... of course. I should have understood that from the multiplication you did, since 104 = 105-1.
On Tue, Jan 3, 2012 at 7:58 PM, Grant Ingersoll wrote:
>
> On Jan 3, 2012, at 5:59 PM, Ted Dunning wrote:
>
> > Your math is correct.
> >
> > When you say you have 105 features, what do you mean?
>
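The multiplication referred to in this thread (104 = 105-1 rows times a feature cardinality of 100K) can be worked through with a rough sketch like the one below. It is only an estimate under stated assumptions: that each model stores a dense matrix of 8-byte doubles with one row per category except a reference category, and that the learner keeps several such models alive at once. The function names and the pool size of 100 are hypothetical placeholders for illustration, not values taken from the thread or from Mahout's documentation.

```python
# Rough heap estimate for a multinomial logistic regression weight matrix.
# Assumptions (not from the thread): dense storage, 8-byte doubles,
# and (num_categories - 1) rows because one category acts as the reference.

BYTES_PER_DOUBLE = 8


def weight_count(num_categories, num_features):
    """Number of weights in one model: (categories - 1) * feature cardinality."""
    return (num_categories - 1) * num_features


def model_bytes(num_categories, num_features):
    """Bytes needed for the weights of a single model."""
    return weight_count(num_categories, num_features) * BYTES_PER_DOUBLE


def pool_bytes(num_categories, num_features, models_in_pool):
    """Bytes needed if the learner holds several models at the same time."""
    return models_in_pool * model_bytes(num_categories, num_features)


if __name__ == "__main__":
    categories = 105      # ASF projects, per the thread
    cardinality = 100_000  # feature vector size, per the thread
    pool = 100             # hypothetical number of models held at once

    print("weights per model:", weight_count(categories, cardinality))
    print("MB per model:", model_bytes(categories, cardinality) / 1e6)
    print("GB for the pool:", pool_bytes(categories, cardinality, pool) / 1e9)
```

Under these assumptions a single model is modest in size, so heap pressure at construction time would come mostly from how many models are allocated together. Lowering the feature cardinality or the number of simultaneous models reduces the estimate proportionally.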