I guess you're not actually using that many features (e.g. 10M); it's just
that hashing the index makes it look that way. Is that correct?
If so, a simple dictionary that maps each feature's hashed index to its
rank can be broadcast and used everywhere, so you can pass MLlib just the
feature's rank as its index.
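A minimal sketch of that remapping (all names here are hypothetical, and it assumes the distinct hashed indices actually present in the data fit in a driver-side map; in Spark the resulting map would be broadcast with `sc.broadcast` and applied in a `map` over the RDD):

```scala
import scala.util.hashing.MurmurHash3

object FeatureRanks {
  // Hypothetical hashing-trick index: hash a feature name into a large space.
  def hashedIndex(feature: String, space: Int = 1 << 24): Int = {
    val h = MurmurHash3.stringHash(feature)
    ((h % space) + space) % space // keep the index non-negative
  }

  // Build the index -> rank dictionary from the distinct hashed indices
  // actually observed. Ranks are compact: 0 until numDistinct.
  def rankMap(hashedIndices: Seq[Int]): Map[Int, Int] =
    hashedIndices.distinct.sorted.zipWithIndex.toMap

  def main(args: Array[String]): Unit = {
    val features = Seq("user=42", "country=US", "ad=7")
    val idx = features.map(hashedIndex(_))
    val ranks = rankMap(idx)
    // Each feature now gets a compact index in [0, numDistinct), so the
    // dense weight vector only needs numDistinct entries, not the full
    // hashing space.
    idx.foreach(i => println(s"$i -> ${ranks(i)}"))
  }
}
```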
Reza
Hi,
Currently in GradientDescent.scala, weights is constructed as a dense
vector:
initialWeights = Vectors.dense(new Array[Double](numFeatures))
And numFeatures is determined in loadLibSVMFile as the maximum feature
index.
But in the case of using hash function to compute feature
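To illustrate why sizing the dense vector by the maximum index is costly under hashing (the 2^24 hashing space below is an assumed figure, not one from the thread):

```scala
object HashedSize {
  def main(args: Array[String]): Unit = {
    // Assumed hashing-trick space of 2^24 possible indices.
    val numFeatures = 1 << 24
    // Memory for Vectors.dense(new Array[Double](numFeatures)):
    // 8 bytes per Double, regardless of how few features are active.
    val bytes = numFeatures.toLong * 8
    println(s"dense weights: ${bytes / (1 << 20)} MiB") // 128 MiB
  }
}
```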