Convert RDD of numpy matrices to Dataframes

2016-11-08 Thread aditya1702
Hello,
I am trying out the MultilayerPerceptronClassifier, which takes only a
DataFrame for training. The problem is that my training data is an RDD of
(x, y) pairs, where x and y are numpy matrices: x has dimensions (1, 401)
and y has dimensions (1, 10). I need to convert this RDD to a DataFrame, but
when I do so the resulting DataFrame contains no data.
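
A minimal sketch of one possible conversion, under assumptions not stated in the post: that y is a one-hot row (so the class index can be recovered with argmax), and that the numpy values must be turned into plain Python floats and an ML vector before Spark can build rows from them. The helper names `to_row` and `to_dataframe` are hypothetical.

```python
import numpy as np

def to_row(x, y):
    """Turn one (x, y) pair of numpy matrices into plain Python values."""
    # Assumption: y is a one-hot row of shape (1, 10), so argmax gives the class index.
    label = float(np.argmax(np.asarray(y).ravel()))
    # Flatten the (1, 401) matrix into a list of Python floats.
    features = [float(v) for v in np.asarray(x).ravel()]
    return label, features

def to_dataframe(spark, train_rdd):
    """Hypothetical helper: build a (label, features) DataFrame from the RDD."""
    from pyspark.ml.linalg import Vectors  # imported lazily; needs Spark 2.x
    rows = train_rdd.map(lambda xy: to_row(xy[0], xy[1])) \
                    .map(lambda r: (r[0], Vectors.dense(r[1])))
    return spark.createDataFrame(rows, ["label", "features"])
```

Here `spark` is assumed to be an existing SparkSession and `train_rdd` the RDD of (x, y) pairs; the resulting columns match the default `labelCol` and `featuresCol` names used by the classifier.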




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Convert-RDD-of-numpy-matrices-to-Dataframes-tp28050.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Need help with SVM

2016-10-24 Thread aditya1702
Hello,
I am using a linear SVM to train my model and fit a line through my data.
However, my model always predicts 1 for every example. Here is my
code:

print(data_rdd.take(5))
[LabeledPoint(1.0, [1.9643,4.5957]), LabeledPoint(1.0, [2.2753,3.8589]),
LabeledPoint(1.0, [2.9781,4.5651]), LabeledPoint(1.0, [2.932,3.5519]),
LabeledPoint(1.0, [3.5772,2.856])]


from pyspark.mllib.classification import SVMWithSGD
from pyspark.mllib.regression import LabeledPoint

# Each row of x_df is (features, label)
data_rdd = x_df.map(lambda x: LabeledPoint(x[1], x[0]))

model = SVMWithSGD.train(data_rdd, iterations=1000, regParam=1)

# Collect the features and labels to the driver
X = x_df.map(lambda x: x[0]).collect()
Y = x_df.map(lambda x: x[1]).collect()

# Predict each example locally with the trained model
pred = [model.predict(i) for i in X]
print(pred)

[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1]


My dataset is as follows:

 


Can someone please help?
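
Two settings that may be worth checking, offered as a sketch rather than a diagnosis: `SVMWithSGD.train` defaults to `intercept=False`, which forces the separating line through the origin, and `regParam=1` is much stronger than the default of 0.01. The helper names and the `reg_param` value below are hypothetical choices for illustration.

```python
from collections import Counter

def prediction_counts(predictions):
    """Count how often each class was predicted, to spot a one-class model."""
    return dict(Counter(predictions))

def train_with_intercept(data_rdd, iterations=1000, reg_param=0.01):
    """Hypothetical helper: retrain with a bias term and weaker regularization."""
    from pyspark.mllib.classification import SVMWithSGD  # imported lazily
    # intercept=True lets the separating line move away from the origin.
    return SVMWithSGD.train(data_rdd, iterations=iterations,
                            regParam=reg_param, intercept=True)
```

Comparing `prediction_counts(pred)` with `prediction_counts(Y)` shows whether the retrained model still collapses to a single class.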



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Need-help-with-SVM-tp27955.html



Plotting decision boundary in non-linear logistic regression

2016-10-21 Thread aditya1702
Hello,
I am working with Logistic Regression on non-linear data and I want to
plot a decision boundary using the data. I don't know how to do this with
a contour plot. Could someone please help me out? This is the code I have
written:

from pyspark.ml.classification import LogisticRegression
from pyspark.sql.functions import col

lr = LogisticRegression(maxIter=1000, regParam=0.3, elasticNetParam=0.20)
model = lr.fit(data_train_df)

# Keep only the true label and the predicted label for the test set
prediction = model.transform(data_test_df)
final_pred_df = prediction.select(col('label'), col('prediction'))
final_pred_df.show()

# Accuracy = matching rows / total rows, as a percentage
ans = final_pred_df.where(col('label') == col('prediction')).count()
accuracy = ans / float(final_pred_df.count())
print(accuracy * 100)

This gives the following output:

+-----+----------+
|label|prediction|
+-----+----------+
|  1.0|       1.0|
|  1.0|       1.0|
|  1.0|       1.0|
|  1.0|       1.0|
|  1.0|       1.0|
|  0.0|       0.0|
|  0.0|       0.0|
|  0.0|       1.0|
|  0.0|       1.0|
|  0.0|       1.0|
|  0.0|       1.0|
|  0.0|       1.0|
|  0.0|       0.0|
|  0.0|       0.0|
|  0.0|       0.0|
|  0.0|       0.0|
|  0.0|       0.0|
+-----+----------+

70.5882352941

Now, how do I visualize this? The data plot was attached as an image.
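
A common way to draw such a boundary, sketched here under assumptions: there are exactly two features, and the fitted model is evaluated on a dense grid of points whose predictions are then passed to a contour plot. The function names `make_grid` and `plot_boundary` are hypothetical, and scoring the grid (turning `points` into a DataFrame with the same feature mapping and calling `model.transform`) is left out.

```python
import numpy as np

def make_grid(x1, x2, steps=100, pad=0.5):
    """Build a mesh that covers the range of both features."""
    g1 = np.linspace(min(x1) - pad, max(x1) + pad, steps)
    g2 = np.linspace(min(x2) - pad, max(x2) + pad, steps)
    xx, yy = np.meshgrid(g1, g2)
    # One row per grid point, in (x1, x2) order.
    points = np.c_[xx.ravel(), yy.ravel()]
    return xx, yy, points

def plot_boundary(xx, yy, grid_predictions):
    """Draw the 0.5 contour of the 0/1 predictions made on the grid points."""
    import matplotlib.pyplot as plt  # imported lazily
    zz = np.asarray(grid_predictions, dtype=float).reshape(xx.shape)
    plt.contour(xx, yy, zz, levels=[0.5])
    plt.show()
```

The predictions passed to `plot_boundary` must be in the same order as the rows of `points`, otherwise the reshaped grid will not line up with the mesh.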

 



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Plotting-decision-boundary-in-non-linear-logistic-regression-tp27937.html



Re: Making more features in Logistic Regression

2016-10-18 Thread aditya1702
 
 



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Making-more-features-in-Logistic-Regression-tp27915p27918.html



Re: Making more features in Logistic Regression

2016-10-18 Thread aditya1702

 

 

Here is the graph and the features with their corresponding data



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Making-more-features-in-Logistic-Regression-tp27915p27917.html



Making more features in Logistic Regression

2016-10-18 Thread aditya1702
Hello,
I am trying to solve a Logistic Regression problem using Spark, and I am
still a newbie to machine learning. I wanted to ask: if I have 2 features
for logistic regression and the relationship between the features and the
label is non-linear (I am using regularized logistic regression), do I have
to make more features by taking higher powers of the original features?
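
To make "higher powers of the features" concrete, here is a minimal sketch of a polynomial expansion of two features up to a chosen degree. The function name and the default degree are hypothetical; Spark also ships a `PolynomialExpansion` transformer in `pyspark.ml.feature` that performs a similar expansion on vector columns.

```python
def polynomial_features(x1, x2, degree=2):
    """Return all monomials x1^a * x2^b with 1 <= a + b <= degree."""
    feats = []
    for d in range(1, degree + 1):
        for i in range(d + 1):
            # The total power is d, split as (d - i) for x1 and i for x2.
            feats.append((x1 ** (d - i)) * (x2 ** i))
    return feats
```

With degree 2, the two original features become five: x1, x2, x1^2, x1*x2 and x2^2. Regularization then helps keep the larger feature set from overfitting.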



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Making-more-features-in-Logistic-Regression-tp27915.html