Hi, I am running Apache Zeppelin on a DigitalOcean droplet, where I have also set up a Spark standalone cluster. I am trying to extract data from MongoDB using the spark-mongodb connector by Stratio, and it takes around 8 minutes to extract a single collection of 12 million documents from the MongoDB replica sets. For testing purposes, I would like to increase the number of worker instances. Please help me. Thank you.
I am relatively new to Spark and Zeppelin.

zeppelin-env.sh:

    export ZEPPELIN_JAVA_OPTS="-Dspark.executor.memory=6g"
    export SPARK_HOME="/home/username/spark"
    export SPARK_SUBMIT_OPTIONS="--packages com.stratio.datasource:spark-mongodb_2.10:0.10.1"
    export MASTER=spark://master-ip:7077
    export ZEPPELIN_PORT=8088
    export ZEPPELIN_MEM=-Xmx8g
    export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

Thanks.
Karthik