Sourav Mazumder created SYSTEMML-519:
----------------------------------------

             Summary: A Zeppelin Notebook showcasing how to use SystemML APIs 
and existing scripts on Spark
                 Key: SYSTEMML-519
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-519
             Project: SystemML
          Issue Type: Documentation
          Components: Documentation
    Affects Versions: SystemML 0.9
            Reporter: Sourav Mazumder
            Priority: Minor


We need a sample Zeppelin notebook that showcases the use of SystemML on Spark 
from Zeppelin. The sample notebook covers the following end-to-end aspects of 
creating a model using SystemML:

1. Ingestion of multiple datasets from HDFS.
2. Exploration of the data using Spark SQL.
3. Merging of the data from the various data sources to prepare the input for 
building the model.
4. Building the model using GLM.dml from SystemML.
5. Using GLM-predict.dml for prediction on a larger population.
6. Relating the predictions back to the original dataset.
7. Visualization of the predictions using R libraries through SparkR.
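
Steps 4 and 5 can be illustrated outside the notebook as plain spark-submit 
invocations of the two scripts. The following is only a minimal sketch, not the 
notebook's own code: the helper name systemml_command, the HDFS paths, and the 
choice of a binomial family with a logit link are assumptions made for 
illustration. It also assumes SystemML.jar and the DML scripts sit in the 
working directory and that the inputs are already stored in a format SystemML 
can read.

```python
# Sketch: build spark-submit argument lists for GLM.dml and GLM-predict.dml.
# All paths are hypothetical placeholders, not taken from the notebook.

def systemml_command(script, nvargs, jar="SystemML.jar"):
    """Return the argument list for running a DML script through spark-submit."""
    args = ["spark-submit", jar, "-f", script, "-nvargs"]
    # -nvargs takes name=value pairs that the script reads as $name
    args += ["{}={}".format(name, value) for name, value in nvargs.items()]
    return args

# Step 4: fit the model (X = features, Y = response, B = output coefficients).
train_cmd = systemml_command("GLM.dml", {
    "X": "hdfs:///tmp/example/X_train",   # hypothetical path
    "Y": "hdfs:///tmp/example/y_train",   # hypothetical path
    "B": "hdfs:///tmp/example/betas",     # hypothetical path
    "dfam": 2,     # assumed: binomial family
    "link": 2,     # assumed: logit link
    "fmt": "csv",  # output format for B
})

# Step 5: score a larger population (M = output matrix of predicted means).
predict_cmd = systemml_command("GLM-predict.dml", {
    "X": "hdfs:///tmp/example/X_full",    # hypothetical path
    "B": "hdfs:///tmp/example/betas",     # coefficients written in step 4
    "M": "hdfs:///tmp/example/means",     # hypothetical path
    "dfam": 2,     # must match the family used for training
    "link": 2,     # must match the link used for training
    "fmt": "csv",
})

if __name__ == "__main__":
    print(" ".join(train_cmd))
    print(" ".join(predict_cmd))
```

Each list can be passed to subprocess.run on a machine where Spark is 
installed. Keeping the commands as lists rather than shell strings avoids 
quoting problems in the path arguments.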

Please note that this notebook uses an R interpreter for Zeppelin 
(https://github.com/apache/incubator-zeppelin/pull/208/commits) that is not 
part of the main branch, so the SparkR paragraphs will not work on the Zeppelin 
main branch. To run them, use the branch associated with this PR (PR#208) 
instead. R and the relevant R packages must also be installed separately on the 
same machine where the Zeppelin process is running. The R packages needed for 
the plots are googleVis, ggplot2, maptools, htmltools, knitr, and repr (from 
http://irkernel.github.io/).
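
Before running the SparkR paragraphs, it may help to confirm that the packages 
listed above are present on the Zeppelin host. The following is a hedged 
sketch of such a check: the function names are hypothetical, and it assumes 
Rscript is on the PATH of the user running it.

```python
import subprocess

# Packages named in the issue text.
REQUIRED_R_PACKAGES = ["googleVis", "ggplot2", "maptools",
                       "htmltools", "knitr", "repr"]

def missing_packages_expr(packages):
    """Build an R expression that prints the packages that are not installed."""
    quoted = ", ".join('"{}"'.format(p) for p in packages)
    return ('cat(setdiff(c({}), rownames(installed.packages())), '
            'sep="\\n")').format(quoted)

def find_missing(packages=REQUIRED_R_PACKAGES, rscript="Rscript"):
    """Run the check through Rscript and return the missing package names."""
    result = subprocess.run(
        [rscript, "-e", missing_packages_expr(packages)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.split()
```

Calling find_missing() on the Zeppelin machine returns an empty list when 
every package is installed. Note that repr comes from the IRkernel project and 
may need to be installed from that source rather than from CRAN.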



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
