This is an automated email from the ASF dual-hosted git repository.

janardhan pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/systemds.git


The following commit(s) were added to refs/heads/main by this push:
     new 2f1c22b  [SYSTEMDS-2971] Run instructions to call DMLScript  via dataproc
2f1c22b is described below

commit 2f1c22b41fd5ecde51cc1c49826f253039d71219
Author: Janardhan Pulivarthi <[email protected]>
AuthorDate: Wed Nov 24 14:36:28 2021 +0530

    [SYSTEMDS-2971] Run instructions to call DMLScript  via dataproc
    
    * fix minor typo
    * download the bin artifacts from dlcdn.apache.org/systemds
---
 scripts/staging/google-cloud/README.md | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/scripts/staging/google-cloud/README.md b/scripts/staging/google-cloud/README.md
index 06a36dd..293e6dd 100644
--- a/scripts/staging/google-cloud/README.md
+++ b/scripts/staging/google-cloud/README.md
@@ -51,12 +51,34 @@ Jobs can be submitted via a Cloud Dataproc API
 
 Submit an example job using `gcloud` tool from the Cloud Shell command line
 
+Test that the cluster is set up properly:
+
 ```sh
-gcloud dataproc jobs submit spark --cluster ${CLUSTER_NAME} \
+gcloud dataproc jobs submit spark --cluster ${CLUSTERNAME} \
   --class org.apache.spark.examples.SparkPi \
   --jars file:///usr/lib/spark/examples/jars/spark-examples.jar -- 1000
 ```
 
+### Add SystemDS library to the cluster
+
+SSH into the cluster, download the artifacts from https://dlcdn.apache.org/systemds/
+and copy the jar file into the `lib` folder.
+
+```sh
+gcloud compute ssh ${CLUSTERNAME}-m --zone=us-central1-c
+wget https://dlcdn.apache.org/systemds/2.2.0/systemds-2.2.0-bin.zip
+unzip -q systemds-2.2.0-bin.zip
+mkdir /usr/lib/systemds
+cp systemds-2.2.0-bin/systemds-2.2.0.jar /usr/lib/systemds
+```
+
+### Run SystemDS as a Spark job
+
+```sh
+gcloud dataproc jobs submit spark --cluster ${CLUSTERNAME} \
+  --class org.apache.sysds.api.DMLScript \
+  --jars file:///usr/lib/systemds/systemds-2.2.0.jar -- 1000
+```
 
 ### Job info and connect
 
@@ -115,6 +137,12 @@ to exit the cluster primary instance
 logout
 ```
 
+### Deleting the cluster
+
+```
+gcloud dataproc clusters delete ${CLUSTERNAME}
+```
+
 ### Tags
 
 A `--tags` option allows us to add a tag to each node in the cluster. Firewall rules
