Your math is correct.

When you say you have 105 features, what do you mean?  Are these textual
features?  Or what?

On Tue, Jan 3, 2012 at 2:53 PM, Grant Ingersoll <gsing...@apache.org> wrote:

> I'm trying to run the full ASF email SGD classifier problem and am facing
> heap size issues.  My current setup has 105 features and I am using a
> cardinality of 100K.  I'm using the AdaptiveLogisticRegression.  I'm
> getting heap errors and they occur when trying to construct the ALR class
> (i.e. not later during training).
>
> Just trying to check my math on memory:
> ALR comes with 20 CrossFoldLearners (CFL) and each of those comes with 5
> OnlineLogisticRegression instances, which each have a DenseMatrix of
> (numFeatures -1) X cardinality, plus some other vectors.
>
> This means, in my case, I have:
> 20 x 5 x (104 x 100,000 x sizeof(double)) = 66,560,000,000 bits = ~8.3 GB
>
> Am I understanding the major parts of memory for ALR correctly?  In other
> words, I need to tone down the number of CFLs in the TrainASFEmail.java
> file so as to not use 20 CFLs, right?
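
For what it's worth, here is a rough sketch of that estimate in Python. It
counts only the dense coefficient matrices and ignores the "other vectors"
and any JVM object overhead, so treat the result as a lower bound. The pool
size of 20 and the 5 learners per CrossFoldLearner are taken from the message
above; the function name is made up for illustration and is not part of
Mahout.

```python
# Back-of-the-envelope estimate of ALR coefficient-matrix memory.
# Assumption: each OnlineLogisticRegression holds one dense matrix of
# doubles with `rows` x `cardinality` entries, as described in the thread.

BYTES_PER_DOUBLE = 8  # a Java double is 64 bits


def alr_matrix_bytes(pool_size, folds, rows, cardinality):
    """Bytes used by the dense matrices across all learners (hypothetical helper)."""
    return pool_size * folds * rows * cardinality * BYTES_PER_DOUBLE


if __name__ == "__main__":
    # Values from the thread: 20 CFLs, 5 OLRs each, 104 x 100,000 matrix.
    for pool_size in (20, 10, 5, 1):
        total = alr_matrix_bytes(pool_size, 5, 104, 100_000)
        print(f"pool size {pool_size:>2}: {total:,} bytes "
              f"= {total / 1e9:.2f} GB = {total / 2**30:.2f} GiB")
```

The total scales linearly with the pool size, so cutting the number of CFLs
cuts the matrix memory by the same factor.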
