Sourav Mazumder created SYSTEMML-519:
----------------------------------------
Summary: A Zeppelin Notebook showcasing how to use SystemML APIs and existing scripts on Spark
Key: SYSTEMML-519
URL: https://issues.apache.org/jira/browse/SYSTEMML-519
Project: SystemML
Issue Type: Documentation
Components: Documentation
Affects Versions: SystemML 0.9
Reporter: Sourav Mazumder
Priority: Minor

We need a sample Zeppelin Notebook that showcases the use of SystemML on Spark from Zeppelin. The sample Notebook covers the following end-to-end aspects of creating a model using SystemML:

1. Ingestion of multiple datasets from HDFS.
2. Exploration of the data using Spark SQL.
3. Merging of the data from the various data sources to prepare the data for building the model.
4. Building the model using GLM.dml of SystemML.
5. Using GLM-predict.dml for prediction on a larger population.
6. Relating the predictions back to the original dataset.
7. Visualization of the predictions using R libraries through SparkR.

Please note that this Notebook uses an R interpreter for Zeppelin (https://github.com/apache/incubator-zeppelin/pull/208/commits) which is not part of the main branch, so the SparkR paragraphs will not work on the Zeppelin main branch. Alternatively, one can use the branch related to this PR (PR#208).

One also has to have R and the relevant R packages installed separately on the same machine where the Zeppelin process is running. The R packages needed for the plots are googleVis, ggplot2, maptools, htmltools, knitr, and repr (from http://irkernel.github.io/).
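
For readers who want to try steps 4 and 5 outside Zeppelin, the following is a minimal sketch, not the Notebook's own code (the Notebook drives SystemML through its APIs). It only assembles `spark-submit` command lines that run GLM.dml and GLM-predict.dml with named arguments via `-nvargs`. The jar name, the HDFS paths, and the helper `systemml_command` are hypothetical; the family and link settings (`dfam=1`, `vpow=0.0`, `link=1`, `lpow=1.0`, i.e. Gaussian with identity link) are an assumed choice for illustration.

```python
import shlex


def systemml_command(script, nvargs, jar="SystemML.jar"):
    """Build a spark-submit argument list that runs a DML script.

    `jar` is a hypothetical location of the SystemML jar; `nvargs` maps
    the script's named arguments to their values.
    """
    cmd = ["spark-submit", jar, "-f", script, "-nvargs"]
    cmd += ["{}={}".format(name, value) for name, value in nvargs.items()]
    return cmd


# Assumed family/link settings: Gaussian distribution with identity link.
family = {"dfam": 1, "vpow": 0.0, "link": 1, "lpow": 1.0}

# Step 4: train on the merged data (all paths below are hypothetical).
train_args = {
    "X": "hdfs:///tmp/glm/X_train",  # feature matrix
    "Y": "hdfs:///tmp/glm/Y_train",  # response values
    "B": "hdfs:///tmp/glm/B",        # output: fitted coefficients
    "fmt": "csv",
}
train_args.update(family)
train_cmd = systemml_command("GLM.dml", train_args)

# Step 5: score the larger population with the fitted coefficients.
predict_args = {
    "X": "hdfs:///tmp/glm/X_population",
    "B": "hdfs:///tmp/glm/B",        # coefficients written by training
    "M": "hdfs:///tmp/glm/means",    # output: predicted means
    "fmt": "csv",
}
predict_args.update(family)
predict_cmd = systemml_command("GLM-predict.dml", predict_args)

for cmd in (train_cmd, predict_cmd):
    # Quote each argument so the line can be pasted into a shell.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The commands are only printed, not executed, so the sketch can be checked without a Spark installation. The prediction step reads the same `B` file that the training step writes, which is what ties steps 4 and 5 together.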