Repository: incubator-systemml
Updated Branches:
  refs/heads/gh-pages 940e1ea73 -> 254f9e5da


[SYSTEMML-1088] [SYSTEMML-1090] Remove the need for label mapping for Scala wrappers

- Also, removed the need to set SparkContext if already created (for
  example in interactive pyspark shell).
- Updated the documentation.
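
For context, the documentation change in this commit drops the
`digits.target + 1` shift from the examples, which suggests callers no
longer have to remap labels themselves. A minimal sketch of what such a
manual mapping can look like, using only NumPy. The helpers
`encode_labels` and `decode_labels` are hypothetical names for
illustration and are not part of the SystemML API:

```python
import numpy as np


def encode_labels(y):
    """Map arbitrary labels to 1-based integer codes.

    Hypothetical helper for illustration; not a SystemML function.
    Returns the codes and the sorted array of distinct labels.
    """
    classes, inverse = np.unique(y, return_inverse=True)
    # `inverse` holds 0-based positions into `classes`; shift to 1-based.
    return inverse + 1, classes


def decode_labels(codes, classes):
    """Recover the original labels from 1-based codes."""
    return classes[np.asarray(codes) - 1]


y = np.array([0, 9, 3, 3, 0])
codes, classes = encode_labels(y)
restored = decode_labels(codes, classes)
print(codes.tolist(), classes.tolist(), restored.tolist())
```

Keeping the lookup table (`classes`) alongside the codes is what lets
predictions be translated back to the caller's original label values.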


Project: http://git-wip-us.apache.org/repos/asf/incubator-systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-systemml/commit/254f9e5d
Tree: http://git-wip-us.apache.org/repos/asf/incubator-systemml/tree/254f9e5d
Diff: http://git-wip-us.apache.org/repos/asf/incubator-systemml/diff/254f9e5d

Branch: refs/heads/gh-pages
Commit: 254f9e5dadd601122fe861da4f1018dd2780e826
Parents: 940e1ea
Author: Niketan Pansare <npan...@us.ibm.com>
Authored: Wed Nov 16 15:26:42 2016 -0800
Committer: Niketan Pansare <npan...@us.ibm.com>
Committed: Wed Nov 16 15:28:31 2016 -0800

----------------------------------------------------------------------
 beginners-guide-python.md | 60 ++++++++++++++++++------------------------
 1 file changed, 25 insertions(+), 35 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-systemml/blob/254f9e5d/beginners-guide-python.md
----------------------------------------------------------------------
diff --git a/beginners-guide-python.md b/beginners-guide-python.md
index 65eef50..8d597bf 100644
--- a/beginners-guide-python.md
+++ b/beginners-guide-python.md
@@ -72,54 +72,41 @@ brew install apache-spark16
 
 #### Step 1: Install SystemML Python package 
 
+We are working towards uploading the python package on pypi. Until then, please use following commands:
+
 ```bash
-pip install systemml
+git checkout https://github.com/apache/incubator-systemml.git
+cd incubator-systemml
+mvn post-integration-test -P distribution -DskipTests
+pip install src/main/python/dist/systemml-incubating-0.11.0.dev1.tar.gz
 ```
 
-#### Step 2: Download SystemML Java binaries
-
-SystemML Python package downloads the corresponding Java binaries (along with algorithms) and places them
-into the installed location. To find the location of the downloaded Java binaries, use the following command:
-
+The above commands will install Python package and place the corresponding Java binaries (along with algorithms) into the installed location.
+To find the location of the downloaded Java binaries, use the following command:
 ```bash
 python -c 'import imp; import os; print os.path.join(imp.find_module("systemml")[1], "systemml-java")'
 ```
 
-#### Step 3: (Optional but recommended) Set SYSTEMML_HOME environment variable
-<div class="codetabs">
-<div data-lang="OSX" markdown="1">
-```bash
-SYSTEMML_HOME=`python -c 'import imp; import os; print os.path.join(imp.find_module("systemml")[1], "systemml-java")'`
-# If you are using zsh or ksh or csh, append it to ~/.zshrc or ~/.profile or ~/.login respectively.
-echo '' >> ~/.bashrc
-echo 'export SYSTEMML_HOME='$SYSTEMML_HOME >> ~/.bashrc
-```
-</div>
-<div data-lang="Linux" markdown="1">
-```bash
-SYSTEMML_HOME=`python -c 'import imp; import os; print os.path.join(imp.find_module("systemml")[1], "systemml-java")'`
-# If you are using zsh or ksh or csh, append it to ~/.zshrc or ~/.profile or ~/.login respectively.
-echo '' >> ~/.bashrc
-echo 'export SYSTEMML_HOME='$SYSTEMML_HOME >> ~/.bashrc
-```
-</div>
-</div>
-
 Note: the user is free to either use the prepackaged Java binaries 
 or download them from [SystemML website](http://systemml.apache.org/download.html)
 or build them from the [source](https://github.com/apache/incubator-systemml).
 
+To uninstall SystemML, please use following command:
+```bash
+pip uninstall systemml-incubating
+```
+
 ### Start Pyspark shell
 
 <div class="codetabs">
 <div data-lang="OSX" markdown="1">
 ```bash
-pyspark --master local[*] --driver-class-path $SYSTEMML_HOME"/SystemML.jar"
+pyspark --master local[*]
 ```
 </div>
 <div data-lang="Linux" markdown="1">
 ```bash
-pyspark --master local[*] --driver-class-path $SYSTEMML_HOME"/SystemML.jar"
+pyspark --master local[*]
 ```
 </div>
 </div>
@@ -131,7 +118,6 @@ To get started with SystemML, let's try few elementary matrix multiplication ope
 ```python
 import systemml as sml
 import numpy as np
-sml.setSparkContext(sc)
 m1 = sml.matrix(np.ones((3,3)) + 2)
 m2 = sml.matrix(np.ones((3,3)) + 3)
 m2 = m1 * (m2 + m1)
@@ -166,7 +152,6 @@ X_test = diabetes_X[-20:]
 y_train = diabetes.target[:-20]
 y_test = diabetes.target[-20:]
 # Train Linear Regression model
-sml.setSparkContext(sc)
 X = sml.matrix(X_train)
 y = sml.matrix(y_train)
 A = X.transpose().dot(X)
@@ -236,7 +221,7 @@ from pyspark.sql import SQLContext
 sqlCtx = SQLContext(sc)
 digits = datasets.load_digits()
 X_digits = digits.data
-y_digits = digits.target + 1
+y_digits = digits.target 
 n_samples = len(X_digits)
 X_train = X_digits[:.9 * n_samples]
 y_train = y_digits[:.9 * n_samples]
@@ -260,18 +245,23 @@ To train the above algorithm on larger dataset, we can load the dataset into Dat
 from sklearn import datasets, neighbors
 from systemml.mllearn import LogisticRegression
 from pyspark.sql import SQLContext
+import pandas as pd
+from sklearn.metrics import accuracy_score
 import systemml as sml
 sqlCtx = SQLContext(sc)
 digits = datasets.load_digits()
 X_digits = digits.data
-y_digits = digits.target + 1
+y_digits = digits.target
 n_samples = len(X_digits)
 # Split the data into training/testing sets and convert to PySpark DataFrame
 df_train = sml.convertToLabeledDF(sqlContext, X_digits[:.9 * n_samples], y_digits[:.9 * n_samples])
-X_test = X_digits[.9 * n_samples:]
-y_test = y_digits[.9 * n_samples:]
+X_test = sqlCtx.createDataFrame(pd.DataFrame(X_digits[.9 * n_samples:]))
 logistic = LogisticRegression(sqlCtx)
-print('LogisticRegression score: %f' % logistic.fit(df_train).score(X_test, y_test))
+logistic.fit(df_train)
+y_predicted = logistic.predict(X_test)
+y_predicted = y_predicted.select('prediction').toPandas().as_matrix().flatten()
+y_test = y_digits[.9 * n_samples:]
+print('LogisticRegression score: %f' % accuracy_score(y_test, y_predicted))
 ```
 
 Output:
