This is an automated email from the ASF dual-hosted git repository.
janardhan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/systemds.git
The following commit(s) were added to refs/heads/master by this push:
new 5b1258f Notebook for SystemDS MLContext on databricks
5b1258f is described below
commit 5b1258f8783126f515eda3f35fa9fde04948ac0f
Author: Janardhan Pulivarthi <[email protected]>
AuthorDate: Mon Aug 3 01:20:08 2020 +0530
Notebook for SystemDS MLContext on databricks
* Run a cluster with the SystemDS library loaded, using MLContext.
* This notebook uses Scala.
* Instructions for Databricks notebook setup
* Creating an account at Databricks community cloud
* Instructions to create a cluster and add the jar file as a library
* Attaching the notebook to the created cluster
NOTE: The .scala source file is used instead of .ipynb for sanity.
Closes #1000.
---
notebooks/databricks/MLContext.scala | 59 ++++++++++++++++++++++++++++++++++++
notebooks/databricks/README.md | 9 ++++++
2 files changed, 68 insertions(+)
diff --git a/notebooks/databricks/MLContext.scala b/notebooks/databricks/MLContext.scala
new file mode 100644
index 0000000..dac190c
--- /dev/null
+++ b/notebooks/databricks/MLContext.scala
@@ -0,0 +1,59 @@
+// Databricks notebook source
+// MAGIC %md # Apache SystemDS on Databricks in 5 minutes
+
+// COMMAND ----------
+
+// MAGIC %md ## Create a quickstart cluster
+// MAGIC
+// MAGIC 1. In the sidebar, right-click the **Clusters** button and open the link in a new window.
+// MAGIC 1. On the Clusters page, click **Create Cluster**.
+// MAGIC 1. Name the cluster **Quickstart**.
+// MAGIC 1. In the Databricks Runtime Version drop-down, select **6.3 (Scala 2.11, Spark 2.4.4)**.
+// MAGIC 1. Click **Create Cluster**.
+// MAGIC 1. Attach the `SystemDS.jar` file to the cluster libraries.
+
+// COMMAND ----------
+
+// MAGIC %md ## Attach the notebook to the cluster and run all commands in the notebook
+// MAGIC
+// MAGIC 1. Return to this notebook.
+// MAGIC 1. In the notebook menu bar, select **<img src="http://docs.databricks.com/_static/images/notebooks/detached.png"/> > Quickstart**.
+// MAGIC 1. When the cluster changes from <img src="http://docs.databricks.com/_static/images/clusters/cluster-starting.png"/> to <img src="http://docs.databricks.com/_static/images/clusters/cluster-running.png"/>, click **<img src="http://docs.databricks.com/_static/images/notebooks/run-all.png"/> Run All**.
+
+// COMMAND ----------
+
+// MAGIC %md ## Load SystemDS MLContext API
+
+// COMMAND ----------
+
+import org.apache.sysds.api.mlcontext._
+import org.apache.sysds.api.mlcontext.ScriptFactory._
+val ml = new MLContext(spark)
+
+// COMMAND ----------
+
+val habermanUrl = "http://archive.ics.uci.edu/ml/machine-learning-databases/haberman/haberman.data"
+val habermanList = scala.io.Source.fromURL(habermanUrl).mkString.split("\n")
+val habermanRDD = sc.parallelize(habermanList)
+val habermanMetadata = new MatrixMetadata(306, 4)
+val typesRDD = sc.parallelize(Array("1.0,1.0,1.0,2.0"))
+val typesMetadata = new MatrixMetadata(1, 4)
+val scriptUrl = "https://raw.githubusercontent.com/apache/systemds/master/scripts/algorithms/Univar-Stats.dml"
+val uni = dmlFromUrl(scriptUrl).in("A", habermanRDD, habermanMetadata).in("K", typesRDD, typesMetadata).in("$CONSOLE_OUTPUT", true)
+ml.execute(uni)
+
+// COMMAND ----------
+
+// MAGIC %md #### Create a neural network layer with (R-like) DML language
+
+// COMMAND ----------
+
+val s = """
+ source("scripts/nn/layers/relu.dml") as relu;
+ X = rand(rows=100, cols=10, min=-1, max=1);
+ R1 = relu::forward(X);
+ R2 = max(X, 0);
+ R = sum(R1==R2);
+ """
+
+val ret = ml.execute(dml(s).out("R")).getScalarObject("R").getDoubleValue();
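The DML cell above checks that the nn library's `relu::forward(X)` agrees elementwise with `max(X, 0)` and sums the agreements into `R`. A minimal plain-Scala sketch of the same equivalence check, runnable without Spark or SystemDS (the object and method names here are illustrative, not part of the notebook):

```scala
object ReluCheck {
  // Mirrors relu::forward(X): elementwise max(x, 0).
  def reluForward(x: Array[Array[Double]]): Array[Array[Double]] =
    x.map(_.map(v => math.max(v, 0.0)))

  // Mirrors sum(R1 == R2): counts positions where the two
  // formulations of ReLU produce the same value.
  def agreement(x: Array[Array[Double]]): Int = {
    val r1 = reluForward(x)
    val r2 = x.map(_.map(v => if (v > 0) v else 0.0))
    r1.flatten.zip(r2.flatten).count { case (a, b) => a == b }
  }
}
```

For a 100x10 matrix, as in the DML snippet, the count should equal 1000 when the two formulations match everywhere.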
diff --git a/notebooks/databricks/README.md b/notebooks/databricks/README.md
new file mode 100644
index 0000000..e7596c9
--- /dev/null
+++ b/notebooks/databricks/README.md
@@ -0,0 +1,9 @@
+#### Setup SystemDS on Databricks platform
+
+1. Create a new account at [databricks cloud](https://community.cloud.databricks.com/)
+2. In the left-side navbar select **Clusters** > **`+ Create Cluster`** > Name the cluster! > **`Create Cluster`**
+3. Navigate to the created cluster configuration.
+ a. Select **Libraries**
+   b. Select **Install New** > **Library Source [`Upload`]** and **Library Type [`Jar`]**
+ c. Upload the `SystemDS.jar` file! > **`Install`**
+4. Attach a notebook to the cluster above.
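The notebook above feeds Univar-Stats the Haberman dataset as a 306x4 matrix after splitting the downloaded text on newlines. A small plain-Scala sketch of that CSV-to-rows step, with a sample inlined instead of the UCI download (the object name is hypothetical, not part of the patch):

```scala
object HabermanParse {
  // Mirrors the notebook's mkString.split("\n") step: one CSV line
  // per matrix row, each row split into numeric columns.
  def toRows(csv: String): Array[Array[Double]] =
    csv.split("\n").filter(_.nonEmpty).map(_.split(",").map(_.trim.toDouble))
}
```

On the real file this yields the 306 rows of 4 columns that `MatrixMetadata(306, 4)` declares; any ragged row would surface here as a parse error before reaching MLContext.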