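You probably need to standardize the features in the data set: scale them
so that they all have comparable ranges, and shift them so that each has
a mean of 0. Without this, SGD can diverge and leave nan in the weights.
You can use a pyspark.mllib.feature.StandardScaler(True, True) object
(the two arguments are withMean and withStd) for that.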
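
To make the transformation concrete, here is a minimal plain-Python sketch of
what standardizing a feature matrix means. It is not Spark code: the function
name `standardize` and the sample rows are made up for illustration, and it
assumes the sample standard deviation (n - 1 denominator), which is my
understanding of what Spark's scaler uses.

```python
import math

def standardize(rows):
    """Shift each column to mean 0 and scale it to unit sample std."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # Sample standard deviation per column (n - 1 in the denominator; assumption).
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
        for c, m in zip(cols, means)
    ]
    # A constant column has std 0; map it to 0.0 rather than divide by zero.
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]

# Hypothetical data: the second feature is about 1000x larger than the first.
rows = [[1.0, 1000.0], [2.0, 3000.0], [3.0, 5000.0]]
scaled = standardize(rows)
print(scaled)
```

In PySpark the corresponding step would be along the lines of
StandardScaler(withMean=True, withStd=True).fit(features).transform(features),
where features is an RDD of dense feature vectors taken from your data; the
labels are left unscaled and paired back with the transformed features before
training.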
On 28.5.2015. 6:08, Maheshakya Wijewardena wrote:
Hi,
I'm trying to use Spark's *LinearRegressionWithSGD* in PySpark with the
attached dataset. The code is attached. When I check the model weights
vector after training, it contains `nan` values.
[nan,nan,nan,nan,nan,nan,nan,nan]
But for some data sets, this problem does not occur. What might