It seems that your response is not scaled, which will cause issues in LBFGS. Typically, people train linear regression with zero-mean/unit-variance features and response, without training the intercept. Since the response is zero-mean, the intercept will always be zero. When you convert the coefficients from the scaled space back to the original space, the intercept can be computed as w0 = <y> - \sum_n <x_n> w_n, where <y> is the average of the response and <x_n> is the average of column n.
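To make that back-conversion concrete, here is a plain-Python sketch (outside Spark, with made-up numbers) that fits a no-intercept model on standardized data and then recovers the original-space slope and intercept via w0 = <y> - \sum_n <x_n> w_n:

```python
# Hypothetical illustration (plain Python, not Spark): fit y = 3x + 5 on
# zero-mean/unit-variance data with no intercept, then unscale the weights.

def mean(v):
    return sum(v) / len(v)

def std(v):
    m = mean(v)
    return (sum((a - m) ** 2 for a in v) / len(v)) ** 0.5

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0 * x + 5.0 for x in xs]

mx, sx = mean(xs), std(xs)
my, sy = mean(ys), std(ys)
xs_s = [(x - mx) / sx for x in xs]   # standardized feature
ys_s = [(y - my) / sy for y in ys]   # standardized response

# Closed-form no-intercept least squares on the scaled data:
# w' = sum(x'y') / sum(x'x'); the intercept is zero since both are zero-mean.
w_scaled = sum(a * b for a, b in zip(xs_s, ys_s)) / sum(a * a for a in xs_s)

# Back to the original space: w = w' * sd(y) / sd(x), then w0 = <y> - <x> * w.
w = w_scaled * sy / sx
w0 = my - mx * w
print(w, w0)  # recovers ~3.0 and ~5.0
```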
Sincerely,

DB Tsai
-------------------------------------------------------
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai


On Fri, Dec 12, 2014 at 10:49 AM, Bui, Tri <tri....@verizonwireless.com> wrote:
> Thanks for the confirmation.
>
> FYI, the code below works for a similar dataset, but with the feature
> magnitude changed, LBFGS converged to the right weights.
>
> For example, the time-sequential feature values 1, 2, 3, 4, 5 would generate
> the error, while the sequential feature values 14111, 14112, 14113, 14115
> would converge to the right weights. Why?
>
> Below is the code to implement StandardScaler() for the sample data
> (10246.0,[14111.0,1.0]):
>
> val scaler1 = new StandardScaler().fit(train.map(x => x.features))
> val train1 = train.map(x => (x.label, scaler1.transform(x.features)))
>
> But I keep getting the error: "value features is not a member of (Double,
> org.apache.spark.mllib.linalg.Vector)"
>
> Should my feature vector be .toInt instead of Double?
>
> Also, the error org.apache.spark.mllib.linalg.Vector should have an "s" to
> match the imported library org.apache.spark.mllib.linalg.Vectors
>
> Thanks
> Tri
>
>
> -----Original Message-----
> From: dbt...@dbtsai.com [mailto:dbt...@dbtsai.com]
> Sent: Friday, December 12, 2014 12:16 PM
> To: Bui, Tri
> Cc: user@spark.apache.org
> Subject: Re: Do I need to apply feature scaling via StandardScaler for
> LBFGS for Linear Regression?
>
> You need to apply the StandardScaler yourself to help the convergence.
> LBFGS just takes whatever objective function you provide without doing any
> scaling. I would like to provide LinearRegressionWithLBFGS, which does the
> scaling internally, in the near future.
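A quick plain-Python sketch (hypothetical data, no Spark or LBFGS involved) of why the raw feature magnitude above changes the convergence behavior: with a large-magnitude feature column, a first-order least-squares solver must use a tiny step size to stay stable, so the intercept direction barely moves; after standardizing the column, a large step size converges quickly.

```python
# Hypothetical illustration (plain gradient descent, not LBFGS) of how
# feature scale affects convergence on a least-squares objective.

def fit(xs, ys, lr, steps):
    # Minimize mean squared error of y ~ w*x + b by gradient descent.
    w = b = 0.0
    n = len(xs)
    for _ in range(steps):
        gw = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / n
        gb = sum((w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * gw
        b -= lr * gb
    return w, b

xs = [14111.0, 14112.0, 14113.0, 14114.0, 14115.0]
ys = [2.0 * x + 1.0 for x in xs]  # true slope 2.0, true intercept 1.0

# Raw features: lr must be ~1e-9 to avoid divergence, and at that step
# size the intercept stays near 0.0 instead of reaching 1.0.
w_raw, b_raw = fit(xs, ys, lr=1e-9, steps=1000)

# Standardized feature: the problem is well conditioned, and lr=0.5
# converges to the exact scaled-space solution within a few hundred steps.
m = sum(xs) / len(xs)
s = (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5
xs_std = [(x - m) / s for x in xs]
w_std, b_std = fit(xs_std, ys, lr=0.5, steps=500)
print(w_raw, b_raw)
print(w_std, b_std)
```

In the standardized run the recovered slope is the scaled-space weight (true slope times the feature's standard deviation) and the intercept is the response mean, which can then be converted back to the original space as described above.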
>
> Sincerely,
>
> DB Tsai
> -------------------------------------------------------
> My Blog: https://www.dbtsai.com
> LinkedIn: https://www.linkedin.com/in/dbtsai
>
>
> On Fri, Dec 12, 2014 at 8:49 AM, Bui, Tri
> <tri....@verizonwireless.com.invalid> wrote:
>> Hi,
>>
>> Trying to use LBFGS as the optimizer, do I need to implement feature
>> scaling via StandardScaler, or does LBFGS do it by default?
>>
>> The following code generated the error "Failure again! Giving up and
>> returning. Maybe the objective is just poorly behaved?".
>>
>> val data = sc.textFile("file:///data/Train/final2.train")
>>
>> val parsedata = data.map { line =>
>>   val partsdata = line.split(',')
>>   LabeledPoint(partsdata(0).toDouble,
>>     Vectors.dense(partsdata(1).split(' ').map(_.toDouble)))
>> }
>>
>> val train = parsedata.map(x => (x.label,
>>   MLUtils.appendBias(x.features))).cache()
>>
>> val numCorrections = 10
>> val convergenceTol = 1e-4
>> val maxNumIterations = 50
>> val regParam = 0.1
>> val initialWeightsWithIntercept = Vectors.dense(new Array[Double](2))
>>
>> val (weightsWithIntercept, loss) = LBFGS.runLBFGS(train,
>>   new LeastSquaresGradient(),
>>   new SquaredL2Updater(),
>>   numCorrections,
>>   convergenceTol,
>>   maxNumIterations,
>>   regParam,
>>   initialWeightsWithIntercept)
>>
>> Did I implement LBFGS for Linear Regression via "LeastSquaresGradient()"
>> correctly?
>>
>> Thanks
>> Tri

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
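For readers following the thread: the MLUtils.appendBias(x.features) call in the quoted code folds the intercept into the weight vector by appending a constant 1.0 feature, so a no-intercept solver recovers the intercept as the last weight. A plain-Python sketch of that trick on hypothetical numbers (closed-form normal equations rather than LBFGS):

```python
# Hypothetical illustration of the appendBias trick: append a constant 1.0
# to each feature vector, solve no-intercept least squares, and read the
# intercept off as the last weight.

def solve_2x2(a, b, c, d, e, f):
    # Solve [[a, b], [c, d]] [w, w0]^T = [e, f] by Cramer's rule.
    det = a * d - b * c
    return (e * d - b * f) / det, (a * f - e * c) / det

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x + 7.0 for x in xs]     # true slope 2.0, true intercept 7.0
rows = [(x, 1.0) for x in xs]        # appendBias: [x] becomes [x, 1.0]

# Normal equations X^T X w = X^T y for the augmented design matrix.
sxx = sum(x * x for x, _ in rows)
sx1 = sum(x for x, _ in rows)
s11 = float(len(rows))
sxy = sum(x * y for (x, _), y in zip(rows, ys))
s1y = sum(ys)

w, w0 = solve_2x2(sxx, sx1, sx1, s11, sxy, s1y)
print(w, w0)  # slope ~2.0, intercept ~7.0 (the appended-bias weight)
```

This also shows why initialWeightsWithIntercept in the quoted code has length 2 for a single raw feature: one slot for the feature weight and one for the appended bias.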