Article below gives a good idea.
http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
Play around with two configurations (a large number of executors with few cores
each, and a small number of executors with many cores each). The calculated
values have to be conservative, or it will make the
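
To make the "conservative" sizing concrete, here is a rough Scala sketch of the arithmetic. It is only an illustration: the names ExecutorSizing and ExecutorPlan are made up, and the reservations (one core and 1 GB per node for the OS and daemons, one executor slot for the YARN application master, about 7% memory overhead) are assumptions rather than values stated in this message.

```scala
// Hypothetical helper: not part of Spark, just the sizing arithmetic.
final case class ExecutorPlan(numExecutors: Int, coresPerExecutor: Int, memoryPerExecutorGb: Int)

object ExecutorSizing {
  // Assumed fraction of executor memory set aside for off-heap overhead.
  val OverheadFraction = 0.07

  def plan(nodes: Int, coresPerNode: Int, memPerNodeGb: Int, coresPerExecutor: Int): ExecutorPlan = {
    // Assumption: leave one core and 1 GB per node for the OS and daemons.
    val usableCores = coresPerNode - 1
    val usableMemGb = memPerNodeGb - 1

    val executorsPerNode = usableCores / coresPerExecutor
    require(executorsPerNode > 0, "coresPerExecutor is larger than the usable cores per node")

    // Assumption: give up one executor slot for the application master.
    val numExecutors = nodes * executorsPerNode - 1

    // Split node memory evenly, subtract the overhead, and round down.
    val memPerSlotGb = usableMemGb.toDouble / executorsPerNode
    val memoryGb = math.floor(memPerSlotGb * (1.0 - OverheadFraction)).toInt

    ExecutorPlan(numExecutors, coresPerExecutor, memoryGb)
  }
}

// Two layouts to compare on a hypothetical 6-node cluster (16 cores, 64 GB per node).
val fewLargeExecutors = ExecutorSizing.plan(6, 16, 64, 5)
val manySmallExecutors = ExecutorSizing.plan(6, 16, 64, 1)
println(fewLargeExecutors)
println(manySmallExecutors)
```

On YARN the three fields would map to the spark-submit options --num-executors, --executor-cores and --executor-memory. Rounding down at every step is what keeps the request on the conservative side.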
---
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai
On Fri, Dec 12, 2014 at 12:23 PM, Bui, Tri tri@verizonwireless.com wrote:
Thanks for the info.
How do I use StandardScaler() to scale example data (10246.0,[14111.0,1.0]) ?
Thx
tri
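
A minimal sketch of one way to do this with MLlib's StandardScaler (org.apache.spark.mllib.feature), assuming the data is an RDD of LabeledPoint. Only the first row below comes from the question; the other two rows are made up so that each column has a non-zero standard deviation. The object and method names are hypothetical.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.feature.StandardScaler
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

object ScaleFeatures {
  // First row is the example from the question; the rest are hypothetical.
  def sample(sc: SparkContext): RDD[LabeledPoint] = sc.parallelize(Seq(
    LabeledPoint(10246.0, Vectors.dense(14111.0, 1.0)),
    LabeledPoint(9100.0, Vectors.dense(12500.0, 0.0)),
    LabeledPoint(11800.0, Vectors.dense(16050.0, 1.0))
  ))

  def scale(data: RDD[LabeledPoint]): RDD[LabeledPoint] = {
    // Fit on the feature vectors only; the labels are left unscaled.
    // withMean = true needs dense vectors, which is what we have here.
    val scaler = new StandardScaler(withMean = true, withStd = true)
      .fit(data.map(_.features))
    data.map(lp => LabeledPoint(lp.label, scaler.transform(lp.features)))
  }
}
```

In spark-shell, where sc already exists, ScaleFeatures.scale(ScaleFeatures.sample(sc)).collect().foreach(println) prints the scaled points. The same fitted scaler has to be applied to any test data before prediction.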
Hi,
I am trying to use LBFGS as the optimizer. Do I need to implement feature scaling
via StandardScaler, or does LBFGS do it by default?
The following code generated the error: Failure again! Giving up and returning.
Maybe the objective is just poorly behaved?
val data =
-Original Message-
From: dbt...@dbtsai.com [mailto:dbt...@dbtsai.com]
Sent: Friday, December 12, 2014 12:16 PM
To: Bui, Tri
Cc: user@spark.apache.org
Subject: Re: Do I need to applied feature scaling via StandardScaler for LBFGS
for Linear Regression?
You need to do the StandardScaler to help
Thanks for the info.
How do I use StandardScaler() to scale example data (10246.0,[14111.0,1.0]) ?
Thx
tri
-Original Message-
From: dbt...@dbtsai.com [mailto:dbt...@dbtsai.com]
Sent: Friday, December 12, 2014 1:26 PM
To: Bui, Tri
Cc: user@spark.apache.org
Subject: Re: Do I need
Thanks! Will try it out.
From: Debasish Das [mailto:debasish.da...@gmail.com]
Sent: Monday, December 08, 2014 5:13 PM
To: Bui, Tri
Cc: user@spark.apache.org
Subject: Re: Learning rate or stepsize automation
Hi Bui,
Please use BFGS based solvers...For BFGS you don't have to specify step size
Hi,
Is there any way to automatically calculate the optimal learning rate or step size
via MLlib for SGD?
Thx
tri
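
As an illustration of the BFGS suggestion, here is a hedged sketch that calls MLlib's LBFGS.runLBFGS directly. The point is that none of its arguments is a step size: the optimizer picks the step with a line search. The data is hypothetical (label = 2 * feature, no noise), and the object name LbfgsNoStepSize and the numeric settings are made up for the example.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.mllib.optimization.{LBFGS, LeastSquaresGradient, SimpleUpdater}

object LbfgsNoStepSize {
  // Returns the fitted weights and the loss recorded at each iteration.
  def fit(sc: SparkContext): (Vector, Array[Double]) = {
    // Hypothetical noiseless training set: (label, features) with label = 2 * x.
    val training = sc.parallelize(
      Seq(1.0, 2.0, 3.0, 4.0).map(x => (2.0 * x, Vectors.dense(x)))
    ).cache()

    LBFGS.runLBFGS(
      training,
      new LeastSquaresGradient(), // squared-error loss for linear regression
      new SimpleUpdater(),        // no regularization
      10,                         // numCorrections
      1e-9,                       // convergenceTol
      50,                         // maxNumIterations
      0.0,                        // regParam
      Vectors.dense(0.0)          // initialWeights
    )
  }
}
```

Calling LbfgsNoStepSize.fit(sc) in spark-shell returns the weights together with the loss history, which is handy for checking that the loss actually decreases.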
Hi,
The following example code is able to build the correct model.weights, but its
predicted value is zero. Am I passing the values to predictOnValues incorrectly? I
also coded a batch version based on LinearRegressionWithSGD() with the same
train and test data, iteration, stepsize info, and it was
values, which is the lp.features.
Thanks
Tri
From: Yanbo Liang [mailto:yanboha...@gmail.com]
Sent: Thursday, November 27, 2014 12:22 AM
To: Bui, Tri
Cc: user@spark.apache.org
Subject: Re: Inaccurate Estimate of weights model from
StreamingLinearRegressionWithSGD
Hi Tri,
Maybe my latest responds
Try
(hdfs:///localhost:8020/user/data/*)
With 3 /.
Thx
tri
-Original Message-
From: Benjamin Cuthbert [mailto:cuthbert@gmail.com]
Sent: Monday, December 01, 2014 4:41 PM
To: user@spark.apache.org
Subject: hdfs streaming context
All,
Is it possible to stream on HDFS directory
For the streaming example I am working on, it accepted (hdfs:///user/data)
without the localhost info.
Let me dig through my hdfs config.
-Original Message-
From: Sean Owen [mailto:so...@cloudera.com]
Sent: Monday, December 01, 2014 4:50 PM
To: Benjamin Cuthbert
Cc:
Yep. No localhost
Usually, I use hdfs:///user/data to indicate that I want HDFS, or
file:///user/data to indicate a local file directory.
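
To see what the extra slash changes, here is a small sketch that only uses java.net.URI from the JDK (no Hadoop involved) to split the forms discussed in this thread into scheme, host, port and path. The UriParts name is made up. As far as I know, when the host is empty Hadoop falls back to the default filesystem from its configuration, but treat that part as an assumption.

```scala
import java.net.URI

// Hypothetical helper for inspecting how a path string parses as a URI.
final case class UriParts(scheme: String, host: Option[String], port: Int, path: String)

object UriParts {
  def parse(s: String): UriParts = {
    val u = new URI(s)
    // getHost is null when the authority is empty; getPort is -1 when unset.
    UriParts(u.getScheme, Option(u.getHost), u.getPort, u.getPath)
  }
}

Seq(
  "hdfs:///user/data",                // scheme only, empty authority
  "hdfs://localhost:8020/user/data",  // two slashes: explicit host and port
  "hdfs:///localhost:8020/user/data", // three slashes: host text ends up in the path
  "file:///user/data"                 // local filesystem
).foreach(s => println(s + " -> " + UriParts.parse(s)))
```

With three slashes followed by localhost:8020, the host and port are parsed as the first path segment rather than as the namenode address, which is why the form without localhost is the one that works.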
-Original Message-
From: Sean Owen [mailto:so...@cloudera.com]
Sent: Monday, December 01, 2014 5:06 PM
To: Bui, Tri
Cc: Benjamin Cuthbert; user
)).setNumIterations(args(4).toInt).setStepSize(.0001)
model.trainOn(trainingData)
model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()
ssc.start()
ssc.awaitTermination()
Thanks
Tri
From: Bui, Tri [mailto:tri@verizonwireless.com.INVALID]
Sent: Tuesday, November 25, 2014 9
=> (lp.label,
lp.features))).print()
[error] ^
[error] two errors found
[error] (compile:compile) Compilation failed
Thanks
Tri
From: Yanbo Liang [mailto:yanboha...@gmail.com]
Sent: Tuesday, November 25, 2014 8:57 PM
To: Bui, Tri
Cc: user@spark.apache.org
Subject: Re: Inaccurate Estimate
().setInitialWeights(Vectors.zeros(args(3).toInt))
.setIntercept(true)
But still get compilation error.
Thanks
Tri
From: Yanbo Liang [mailto:yanboha...@gmail.com]
Sent: Tuesday, November 25, 2014 4:08 AM
To: Bui, Tri
Cc: user@spark.apache.org
Subject: Re: Inaccurate Estimate of weights model from
Does this also apply to StreamingContext?
What issues would I have if I have 1000s of StreamingContexts?
Thanks
Tri
From: Daniil Osipov [mailto:daniil.osi...@shazam.com]
Sent: Friday, November 14, 2014 3:47 PM
To: Charles
Cc: u...@spark.incubator.apache.org
Subject: Re: Mulitple Spark Context
Hi,
The model weights are not updating for streaming linear regression. The code and
data below are what I am running.
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD