MLLib regression model weights

2014-09-18 Thread Sameer Tilak
Hi All, I am able to run LinearRegressionWithSGD on a small sample dataset (~60MB Libsvm file of sparse data) with 6700 features. val model = LinearRegressionWithSGD.train(examples, numIterations) At the end I get a model that model.weights.sizeres6: Int = 6699 I am assuming each entry in the

Re: MLLib regression model weights

2014-09-18 Thread Debasish Das
sc.parallelize(model.weights.toArray, blocks).top(k) will get that right ? For logistic you might want both positive and negative feature...so just pass it through a filter on abs and then pick top(k) On Thu, Sep 18, 2014 at 10:30 AM, Sameer Tilak ssti...@live.com wrote: Hi All, I am able to

Re: MLLib regression model weights

2014-09-18 Thread Xiangrui Meng
The importance should be based on some statistics, for example, the standard deviation of the feature column and the magnitude of the weight. If the columns are scaled to unit standard deviation (using StandardScaler), you can tell the importance by the absolute value of the weight. But there are