Re: MLLIb: Linear regression: Loss was due to java.lang.ArrayIndexOutOfBoundsException

2014-12-15 Thread Xiangrui Meng
Is it possible that the feature dimension changed after filtering? This may happen if you use LIBSVM format but didn't specify the number of features.

-Xiangrui

On Tue, Dec 9, 2014 at 4:54 AM, Sameer Tilak wrote:
> Hi All,
>
> I was able to run LinearRegressionWithSGD for a larger dataset (> 2
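To illustrate the point about the feature dimension, here is a minimal pure-Python sketch (not MLlib code) of how a loader that infers the dimension from the largest index it sees can report a smaller dimension once rows are filtered out. The sample rows, the label-based filter, and the helper names `parse_libsvm_line` and `inferred_num_features` are all made up for illustration. In MLlib itself, `MLUtils.loadLibSVMFile` accepts a `numFeatures` argument, which is presumably what "specify the number of features" refers to.

```python
def parse_libsvm_line(line):
    """Parse 'label idx:val idx:val ...' (1-based indices) into (label, {idx: val})."""
    parts = line.split()
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        idx, val = item.split(":")
        features[int(idx)] = float(val)
    return label, features


def inferred_num_features(lines):
    """Dimension inferred when no feature count is given: the largest index seen."""
    parsed = (parse_libsvm_line(line) for line in lines)
    return max((max(feats) for _, feats in parsed if feats), default=0)


# Hypothetical superset: the highest feature index (10) appears only in the last row.
full = [
    "1.0 1:0.5 7:2.0",
    "0.0 2:1.5 3:0.1",
    "3.0 3:4.0 10:1.0",
]

# Hypothetical filter that happens to drop the only row using index 10.
subset = [line for line in full if parse_libsvm_line(line)[0] < 2.0]

full_dim = inferred_num_features(full)
subset_dim = inferred_num_features(subset)
print("superset dimension:", full_dim)
print("subset dimension:", subset_dim)
```

If the two files are loaded separately without an explicit feature count, vectors and weights built from one will not line up with the other, which is one way an index can fall out of bounds.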

MLLIb: Linear regression: Loss was due to java.lang.ArrayIndexOutOfBoundsException

2014-12-08 Thread Sameer Tilak
Hi All,

I was able to run LinearRegressionWithSGD for a larger dataset (> 2GB sparse). I have now filtered the data and I am running regression on a subset of it (~ 200 MB). I see this error, which is strange since it was running fine with the superset data. Is this a formatting issue

Re: MLLib Linear regression

2014-10-08 Thread Xiangrui Meng
> >> Date: Tue, 7 Oct 2014 15:11:39 -0700
>> Subject: Re: MLLib Linear regression
>> From: men...@gmail.com
>> To: ssti...@live.com
>> CC: user@spark.apache.org
>
>>
>> Did you test different regularization parameters and step sizes? In
>> the comb

RE: MLLib Linear regression

2014-10-08 Thread Sameer Tilak
Oct 2014 15:11:39 -0700
> Subject: Re: MLLib Linear regression
> From: men...@gmail.com
> To: ssti...@live.com
> CC: user@spark.apache.org
>
> Did you test different regularization parameters and step sizes? In
> the combination that works, I don't see "A + D". Did
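As a rough illustration of what testing different regularization parameters and step sizes can look like, the sketch below sweeps a small grid and keeps the pair with the lowest validation RMSE. It is a pure-Python stand-in that uses full-batch gradient descent on a one-feature model, not MLlib's SGD implementation. The data, the grid values, and the helper names `train` and `rmse` are assumptions made up for this example.

```python
import itertools
import math


def train(points, step, reg, iterations=200):
    """Full-batch gradient descent for y ~ w * x with an L2 penalty (reg / 2) * w**2."""
    w = 0.0
    n = len(points)
    for _ in range(iterations):
        grad = sum((w * x - y) * x for x, y in points) / n + reg * w
        w -= step * grad
    return w


def rmse(points, w):
    """Root mean squared error of the model y ~ w * x on the given points."""
    return math.sqrt(sum((w * x - y) ** 2 for x, y in points) / len(points))


# Made-up data following y = 2x, with the feature already scaled to [0, 1].
train_set = [(i / 10.0, 2.0 * i / 10.0) for i in range(11)]
valid_set = [(0.25, 0.5), (0.75, 1.5)]

# Hypothetical grid of step sizes and regularization parameters.
results = {}
for step, reg in itertools.product([0.01, 0.1, 1.0], [0.0, 0.01, 0.1]):
    w = train(train_set, step, reg)
    results[(step, reg)] = rmse(valid_set, w)

best = min(results, key=results.get)
for (step, reg), err in sorted(results.items()):
    print("step=%g reg=%g rmse=%.6f" % (step, reg, err))
print("best (step, reg):", best)
```

On real data the grid would be chosen around the defaults in use, and the validation set should be held out from training, as it is here.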

Re: MLLib Linear regression

2014-10-07 Thread Xiangrui Meng
the domain knowledge.
>
> Any help will be highly appreciated.
>
>
> ____
> From: ssti...@live.com
> To: user@spark.apache.org
> Subject: MLLib Linear regression
> Date: Tue, 7 Oct 2014 13:41:03 -0700
>
>
> Hi All,
> I have following classes of fe

RE: MLLib Linear regression

2014-10-07 Thread Sameer Tilak
...@live.com
To: user@spark.apache.org
Subject: MLLib Linear regression
Date: Tue, 7 Oct 2014 13:41:03 -0700

Hi All,
I have the following classes of features:
class A: 15000 features
class B: 170 features
class C: 900 features
class D: 6000 features
I use linear regression (over sparse data). I get excellent

MLLib Linear regression

2014-10-07 Thread Sameer Tilak
Hi All,
I have the following classes of features:
class A: 15000 features
class B: 170 features
class C: 900 features
class D: 6000 features
I use linear regression (over sparse data). I get excellent results with low RMSE (~0.06) for the following combinations of classes:
1. A + B + C
2. B + C + D
3. A

Re: MLlib Linear Regression Mismatch

2014-10-01 Thread Krishna Sankar
be 0.1 or 0.01?
>
> Best,
> Burak
>
> - Original Message -
> From: "Krishna Sankar"
> To: user@spark.apache.org
> Sent: Wednesday, October 1, 2014 12:43:20 PM
> Subject: MLlib Linear Regression Mismatch
>
> Guys,
> Obviously I am doing some

Re: MLlib Linear Regression Mismatch

2014-10-01 Thread Burak Yavuz
2:43:20 PM
Subject: MLlib Linear Regression Mismatch

Guys,
Obviously I am doing something wrong. Maybe 4 points are too small a dataset. Can you help me figure out why the following doesn't work?

a) This works:
data = [
  LabeledPoint(0.0, [0.0]),
  LabeledPoint(10.0, [10.0]),
  La

MLlib Linear Regression Mismatch

2014-10-01 Thread Krishna Sankar
Guys,
Obviously I am doing something wrong. Maybe 4 points are too small a dataset. Can you help me figure out why the following doesn't work?

a) This works:
data = [
  LabeledPoint(0.0, [0.0]),
  LabeledPoint(10.0, [10.0]),
  LabeledPoint(20.0, [20.0]),
  LabeledPoint(30.0, [30.0])
]