Re: MLLIb: Linear regression: Loss was due to java.lang.ArrayIndexOutOfBoundsException

2014-12-15 Thread Xiangrui Meng
Is it possible that after filtering the feature dimension changed? This may happen if you use LIBSVM format but didn't specify the number of features. -Xiangrui On Tue, Dec 9, 2014 at 4:54 AM, Sameer Tilak ssti...@live.com wrote: Hi All, I was able to run LinearRegressionwithSGD for a largeer

MLLIb: Linear regression: Loss was due to java.lang.ArrayIndexOutOfBoundsException

2014-12-08 Thread Sameer Tilak
Hi All, I was able to run LinearRegressionwithSGD for a largeer dataset ( 2GB sparse). I have now filtered the data and I am running regression on a subset of it (~ 200 MB). I see this error, which is strange since it was running fine with the superset data. Is this a formatting issue

RE: MLLib Linear regression

2014-10-08 Thread Sameer Tilak
2014 15:11:39 -0700 Subject: Re: MLLib Linear regression From: men...@gmail.com To: ssti...@live.com CC: user@spark.apache.org Did you test different regularization parameters and step sizes? In the combination that works, I don't see A + D. Did you test that combination? Are there any linear

Re: MLLib Linear regression

2014-10-08 Thread Xiangrui Meng
Subject: Re: MLLib Linear regression From: men...@gmail.com To: ssti...@live.com CC: user@spark.apache.org Did you test different regularization parameters and step sizes? In the combination that works, I don't see A + D. Did you test that combination? Are there any linear dependency between

MLLib Linear regression

2014-10-07 Thread Sameer Tilak
Hi All,I have following classes of features: class A: 15000 featuresclass B: 170 featuresclass C: 900 featuresClass D: 6000 features. I use linear regression (over sparse data). I get excellent results with low RMSE (~0.06) for the following combinations of classes:1. A + B + C 2. B + C + D3.

RE: MLLib Linear regression

2014-10-07 Thread Sameer Tilak
...@live.com To: user@spark.apache.org Subject: MLLib Linear regression Date: Tue, 7 Oct 2014 13:41:03 -0700 Hi All,I have following classes of features: class A: 15000 featuresclass B: 170 featuresclass C: 900 featuresClass D: 6000 features. I use linear regression (over sparse data). I get excellent

Re: MLLib Linear regression

2014-10-07 Thread Xiangrui Meng
. From: ssti...@live.com To: user@spark.apache.org Subject: MLLib Linear regression Date: Tue, 7 Oct 2014 13:41:03 -0700 Hi All, I have following classes of features: class A: 15000 features class B: 170 features class C: 900 features Class D: 6000 features. I use

Re: MLlib Linear Regression Mismatch

2014-10-01 Thread Burak Yavuz
, 2014 12:43:20 PM Subject: MLlib Linear Regression Mismatch Guys, Obviously I am doing something wrong. May be 4 points are too small a dataset. Can you help me to figure out why the following doesn't work ? a) This works : data = [ LabeledPoint(0.0, [0.0]), LabeledPoint(10.0, [10.0

Re: MLlib Linear Regression Mismatch

2014-10-01 Thread Krishna Sankar
to be 0.1 or 0.01? Best, Burak - Original Message - From: Krishna Sankar ksanka...@gmail.com To: user@spark.apache.org Sent: Wednesday, October 1, 2014 12:43:20 PM Subject: MLlib Linear Regression Mismatch Guys, Obviously I am doing something wrong. May be 4 points are too small