Hi Gustavo,
You can set up those PySpark details inside the Jupyter notebook itself:
import os
import sys
import pandas as pd
import numpy as np
spark_path = r"C:\spark"  # raw string, so the backslash is never read as an escape
os.environ['SPARK_HOME'] = spark_path
os.environ['HADOOP_HOME'] = spark_path
sys.path.append(spark_path + "/bin")
sys.path.append(spark_path + "/python")
sys.path.append(spark_path + "/python/pyspark/")
sys.path.append(spark_path + "/python/lib")
sys.path.append(spark_path + "/python/lib/pyspark.zip")
sys.path.append(spark_path + "/python/lib/py4j-0.10.4-src.zip")
from pyspark import SparkContext
from pyspark import SparkConf
sc = SparkContext("local[*]", "test")
# SystemML Specifications:
from pyspark.sql import SQLContext
import systemml as sml
sqlCtx = SQLContext(sc)
ml = sml.MLContext(sc)
This is not a very good way of doing it, though. I did it this way because I'm
using Windows and it's easier to do it like that.
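
One fragile spot in the snippet above is the hard-coded py4j file name, which changes between Spark releases. Below is a minimal sketch of a way around that, assuming the usual Spark layout with python/lib under the Spark directory. The helper names spark_python_paths and configure_spark are made up for illustration and are not part of PySpark or SystemML.

```python
import glob
import os
import sys


def spark_python_paths(spark_home):
    """Return the sys.path entries PySpark needs under spark_home.

    Hypothetical helper: assumes the python/lib layout used in the
    snippet above.
    """
    lib_dir = os.path.join(spark_home, "python", "lib")
    paths = [
        os.path.join(spark_home, "python"),
        os.path.join(lib_dir, "pyspark.zip"),
    ]
    # Pick up whichever py4j source zip ships with this Spark build
    # instead of hard-coding a version such as 0.10.4.
    paths.extend(sorted(glob.glob(os.path.join(lib_dir, "py4j-*-src.zip"))))
    return paths


def configure_spark(spark_home):
    """Set SPARK_HOME and extend sys.path; return the entries that were added."""
    os.environ["SPARK_HOME"] = spark_home
    added = [p for p in spark_python_paths(spark_home) if p not in sys.path]
    sys.path.extend(added)
    return added
```

Calling configure_spark(r"C:\spark") in the first notebook cell, before "from pyspark import SparkContext", would then replace the sys.path.append lines. HADOOP_HOME can still be set the same way as in the snippet above.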
Regards,
Arijit
________________________________
From: Gustavo Frederico <[email protected]>
Sent: Sunday, July 2, 2017 10:16:03 AM
To: [email protected]
Subject: Install - Configure Jupyter Notebook
A basic question: step 3 in https://systemml.apache.org/install-systemml.html
for “Configure Jupyter Notebook” has
# Start Jupyter Notebook Server
PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS="notebook" pyspark \
  --master local[*] --conf "spark.driver.memory=12g" \
  --conf spark.driver.maxResultSize=0 --conf spark.akka.frameSize=128 \
  --conf spark.default.parallelism=100
Where does that go? There are no details in this step…
Thanks
Gustavo