Re: Python Logistic Regression error

2014-11-24 Thread Xiangrui Meng
The data is in LIBSVM format, so this line won't work:

    values = [float(s) for s in line.split(' ')]

Please use the util function in MLUtils to load it as an RDD of LabeledPoint:

http://spark.apache.org/docs/latest/mllib-data-types.html#labeled-point

    from pyspark.mllib.util import MLUtils
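To illustrate why the naive split fails, here is a minimal pure-Python sketch. It assumes the usual LIBSVM layout of `label index:value index:value ...` with one-based indices. The sample line and the helper name `parse_libsvm_line` are made up for illustration and are not part of Spark. In a real job, `MLUtils.loadLibSVMFile(sc, path)` does this parsing and returns an RDD of LabeledPoint.

```python
def parse_libsvm_line(line):
    """Parse one LIBSVM-format line into (label, {index: value}).

    Hypothetical helper for illustration only; Spark's MLUtils handles
    this internally when loading a LIBSVM file.
    """
    parts = line.strip().split()
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        # Each feature token looks like "index:value".
        index, value = item.split(":", 1)
        features[int(index)] = float(value)
    return label, features


# Hypothetical sample line, not taken from the Spark data file.
sample = "1.0 1:0.5 3:-2.0"

# The approach from logistic_regression.py: every token must be a bare float,
# which is not true for "index:value" tokens.
try:
    values = [float(s) for s in sample.split(' ')]
    naive_error = None
except ValueError as exc:
    naive_error = exc

label, features = parse_libsvm_line(sample)
print("naive parse failed:", naive_error is not None)
print("label:", label, "features:", features)
```

The naive parser raises `ValueError` on the first `index:value` token, which matches the failure described above. The sparse dictionary is only a stand-in for the sparse vector that a LabeledPoint would hold.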

Python Logistic Regression error

2014-11-23 Thread Venkat, Ankam
Can you please suggest sample data for running the logistic_regression.py? I am trying to use a sample data file at

https://github.com/apache/spark/blob/master/data/mllib/sample_linear_regression_data.txt

I am running this on CDH5.2 Quickstart VM.

[cloudera@quickstart mllib]$ spark-submit