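Are you using an old version of NumPy, e.g. 1.4? Such old releases do
not have the ndarray.dot method that _dot calls, which would explain
the AttributeError. Try replacing the method-style call on the array,
.dot(target), with a call to the module-level function numpy.dot,
passing the array as the first argument and target as the second.
Alternatively, use the latest master, where I think this is fixed.
-Xiangrui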
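
For reference, here is a minimal sketch of the workaround, not the actual
MLlib code. The helper name safe_dot and the sample weights, features,
and intercept are made up for illustration. As far as I recall, the
ndarray.dot method first appeared in NumPy 1.5, while the module-level
numpy.dot function is also available in older releases:

```python
import numpy as np


def safe_dot(a, b):
    """Dot product via the module-level function.

    Hypothetical helper: it avoids calling a.dot(b), which raises
    AttributeError on NumPy releases that lack the ndarray.dot method.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.dot(a, b)


# Illustrative values only; these are not taken from the failing job.
weights = np.array([0.5, -1.0, 2.0])
features = np.array([2.0, 1.0, 0.5])
intercept = 0.1

# Same shape of computation as the failing line in predict():
# margin = dot(x, coeff) + intercept
margin = safe_dot(features, weights) + intercept

print("NumPy version:", np.__version__)
print("margin:", margin)
```

Printing np.__version__ on the machines that run the job should also
confirm which NumPy release PySpark is actually picking up.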

On Mon, Jun 30, 2014 at 2:04 PM, Sam Jacobs wrote:
> Hi,
> I modified the example code for logistic regression to compute the error in
> classification. Please see below. However the code is failing when it makes
> a call to:
> labelsAndPreds.filter(lambda (v, p): v != p).count()
> with the error message (something related to numpy or dot product):
> File "/opt/spark-1.0.0-bin-hadoop2/python/pyspark/mllib/",
> line 65, in predict
>     margin = _dot(x, self._coeff) + self._intercept
>   File "/opt/spark-1.0.0-bin-hadoop2/python/pyspark/mllib/", line
> 443, in _dot
>     return
> AttributeError: 'numpy.ndarray' object has no attribute 'dot'
> FYI, I am running the code using spark-submit i.e.
> ./bin/spark-submit examples/src/main/python/mllib/
> The code is posted below if it will be useful in any way:
> from math import exp
> import sys
> import time
> from pyspark import SparkContext
> from pyspark.mllib.classification import LogisticRegressionWithSGD
> from pyspark.mllib.regression import LabeledPoint
> from numpy import array
> # Load and parse the data
> def parsePoint(line):
>     values = [float(x) for x in line.split(',')]
>     if values[0] == -1:   # Convert -1 labels to 0 for MLlib
>         values[0] = 0
>     return LabeledPoint(values[0], values[1:])
> sc = SparkContext(appName="PythonLR")
> # start timing
> start = time.time()
> #start = time.clock()
> data = sc.textFile("sWAMSpark_train.csv")
> parsedData = data.map(parsePoint)
> # Build the model
> model = LogisticRegressionWithSGD.train(parsedData)
> #load test data
> testdata = sc.textFile("sWSpark_test.csv")
> parsedTestData = testdata.map(parsePoint)
> # Evaluating the model on test data
> labelsAndPreds = parsedTestData.map(lambda p: (p.label,
> model.predict(p.features)))
> trainErr = labelsAndPreds.filter(lambda (v, p): v != p).count() /
> float(parsedData.count())
> print("Training Error = " + str(trainErr))
> end = time.time()
> print("Time is = " + str(end - start))

