skorper commented on code in PR #185:
URL: https://github.com/apache/incubator-sdap-nexus/pull/185#discussion_r984133683


##########
docs/quickstart.rst:
##########
@@ -18,43 +18,59 @@ This quickstart guide will walk you through how to install and run NEXUS on your
 Prerequisites
 ==============
 
-* Docker (tested on v18.03.1-ce)
+* Docker (tested on v20.10.17)
 * Internet Connection
-* bash
+* bash or zsh
 * cURL
-* 500 MB of disk space
+* 10.5 GB of disk space
 
 Prepare
 ========
 
-Start downloading the Docker images and data files.
+Start downloading the Docker images and set up the Docker bridge network.
 
 .. _quickstart-step1:
 
 Pull Docker Images
 -------------------
 
-Pull the necessary Docker images from the `SDAP repository <https://hub.docker.com/u/sdap>`_ on Docker Hub. Please check the repository for the latest version tag.
+Pull the necessary Docker images from the `NEXUS JPL repository <https://hub.docker.com/u/nexusjpl>`_ on Docker Hub. Please check the repository for the latest version tag.
 
 .. code-block:: bash
 
-  export VERSION=1.0.0-rc1
+  export CASSANDRA_VERSION=3.11.6-debian-10-r138
+  export RMQ_VERSION=3.8.9-debian-10-r37
+  export COLLECTION_MANAGER_VERSION=0.1.6a14
+  export GRANULE_INGESTER_VERSION=0.1.6a30
+  export WEBAPP_VERSION=distributed.0.4.5a49
+  export SOLR_VERSION=8.11.1
+  export SOLR_CLOUD_INIT_VERSION=1.0.2
+  export ZK_VERSION=3.5.5
+
+  export JUPYTER_VERSION=1.2
 
 .. code-block:: bash
 
-  docker pull sdap/ningester:${VERSION}
-  docker pull sdap/solr-singlenode:${VERSION}
-  docker pull sdap/cassandra:${VERSION}
-  docker pull sdap/nexus-webapp:standalone.${VERSION}
+  docker pull bitnami/cassandra:${CASSANDRA_VERSION}
+  docker pull bitnami/rabbitmq:${RMQ_VERSION}
+  docker pull nexusjpl/collection-manager:${COLLECTION_MANAGER_VERSION}
+  docker pull nexusjpl/granule-ingester:${GRANULE_INGESTER_VERSION}
+  docker pull nexusjpl/nexus-webapp:${WEBAPP_VERSION}
+  docker pull nexusjpl/solr:${SOLR_VERSION}
+  docker pull nexusjpl/solr-cloud-init:${SOLR_CLOUD_INIT_VERSION}
+  docker pull zookeeper:${ZK_VERSION}
+
+  # docker pull nexusjpl/jupyter:${JUPYTER_VERSION}

Review Comment:
   let's update this back to nexusjpl before we merge this
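
   For instance, assuming the Jupyter image ends up published under the nexusjpl org with the version exported above, the pull would be (hypothetical until that image exists on Docker Hub):

   ```bash
   # hypothetical: only works once nexusjpl/jupyter is published with this tag
   docker pull nexusjpl/jupyter:${JUPYTER_VERSION}
   ```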



##########
CHANGELOG.md:
##########
@@ -41,9 +41,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Fixed issue where satellite to satellite matchups with the same dataset don't return the expected result
 - Fixed CSV and NetCDF matchup output bug
 - Fixed NetCDF output switching latitude and longitude
+- SDAP-399: Updated quickstart guide for standalone docker deployment of SDAP.
+- SDAP-399: Updated quickstart Jupyter notebook
 - Fixed import error causing `/timeSeriesSpark` queries to fail.
 - Fixed bug where domsresults no longer worked after successful matchup
 - Fixed certificate error in Dockerfile
-### Security

Review Comment:
   You can leave this section in (empty) in case we make a security change later.
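
   For example, the removed heading could simply stay in CHANGELOG.md with nothing under it:

   ```
   ### Security
   ```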



##########
docker/jupyter/Dockerfile:
##########
@@ -29,7 +29,8 @@ ENV CHOWN_HOME_OPTS='-R'
 ENV REBUILD_CODE=true
 
 ARG APACHE_NEXUS=https://github.com/apache/incubator-sdap-nexus.git
-ARG APACHE_NEXUS_BRANCH=master
+ARG APACHE_NEXUS_COMMIT=be19c1d567301b09269e851cc5b5af55fea02c5d

Review Comment:
   This commit is from 2019... ideally, I think we'd like this to run on the latest code. What issues did you run into when running this on master?
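
   If master does build cleanly, a sketch of pointing the quickstart image at the latest code (this assumes the Dockerfile keeps a branch/ref build argument like the `APACHE_NEXUS_BRANCH` it had before this change):

   ```bash
   # sketch: build the quickstart Jupyter image from the tip of master
   # (assumes APACHE_NEXUS_BRANCH is still accepted as a build arg)
   docker build -t nexusjpl/jupyter:${JUPYTER_VERSION} \
       --build-arg APACHE_NEXUS=https://github.com/apache/incubator-sdap-nexus.git \
       --build-arg APACHE_NEXUS_BRANCH=master \
       docker/jupyter
   ```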



##########
docker/jupyter/requirements.txt:
##########
@@ -1,4 +1,3 @@
-shapely
-requests
-numpy
-cassandra-driver==3.9.0

Review Comment:
   I'm curious why we were able to remove this? Or why it was needed before but not now?
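
   One quick way to check (a sketch, assuming the built image tag from the quickstart) is to see whether those packages are already pulled in transitively:

   ```bash
   # list any of the removed requirements that are already installed in the image
   docker run --rm nexusjpl/jupyter:${JUPYTER_VERSION} \
       pip list | grep -iE 'shapely|requests|numpy|cassandra-driver'
   ```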



##########
docs/quickstart.rst:
##########
@@ -64,181 +80,240 @@ The network we will be using for this quickstart will be called ``sdap-net``. Cr
 
 .. _quickstart-step3:
 
-Download Sample Data
----------------------
+Start Ingester Components and Ingest Some Science Data
+========================================================
 
-The data we will be downloading is part of the `AVHRR OI dataset <https://podaac.jpl.nasa.gov/dataset/AVHRR_OI-NCEI-L4-GLOB-v2.0>`_ which measures sea surface temperature. We will download 1 month of data and ingest it into a local Solr and Cassandra instance.
+Create Data Directory
+------------------------
+
+Let's start by creating the directory to hold the science data to ingest.
 
 Choose a location that is mountable by Docker (typically needs to be under the User's home directory) to download the data files to.
 
 .. code-block:: bash
 
-  export DATA_DIRECTORY=~/nexus-quickstart/data/avhrr-granules
-  mkdir -p ${DATA_DIRECTORY}
+    export DATA_DIRECTORY=~/nexus-quickstart/data/avhrr-granules
+    mkdir -p ${DATA_DIRECTORY}
 
-Then go ahead and download 1 month worth of AVHRR netCDF files.
+Now we can start up the data storage components. We will be using Solr and Cassandra to store the tile metadata and data respectively.
 
-.. code-block:: bash
+.. _quickstart-step4:
 
-  cd $DATA_DIRECTORY
+Start Zookeeper
+---------------
 
-  export URL_LIST="https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/305/20151101120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/306/20151102120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/307/20151103120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/308/20151104120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/309/20151105120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/310/20151106120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/311/20151107120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/312/20151108120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/313/20151109120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/314/20151110120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/315/20151111120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/316/20151112120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/317/20151113120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/318/20151114120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/319/20151115120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/320/20151116120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/321/20151117120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/322/20151118120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/323/20151119120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/324/20151120120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/325/20151121120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/326/20151122120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/327/20151123120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/328/20151124120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/329/20151125120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/330/20151126120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/331/20151127120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/332/20151128120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/333/20151129120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc
https://podaac-opendap.jpl.nasa.gov:443/opendap/allData/ghrsst/data/GDS2/L4/GLOB/NCEI/AVHRR_OI/v2/2015/334/20151130120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.0.nc";
+In order to run Solr in cloud mode, we must first run Zookeeper.
 
-  for url in ${URL_LIST}; do
-    curl -O "${url}"
-  done
+.. code-block:: bash
 
-You should now have 30 files downloaded to your data directory, one for each day in November 2015.
+    docker run --name zookeeper -dp 2181:2181 zookeeper:${ZK_VERSION}
 
-Start Data Storage Containers
-==============================
+We then need to ensure the ``/solr`` znode is present.
 
-We will use Solr and Cassandra to store the tile metadata and data respectively.
+.. code-block:: bash
 
-.. _quickstart-step4:
+  docker exec zookeeper bash -c "bin/zkCli.sh create /solr"
+
+.. _quickstart-step5:
 
 Start Solr
 -----------
 
-SDAP is tested with Solr version 7.x with the JTS topology suite add-on installed. The SDAP docker image is based off of the official Solr image and simply adds the JTS topology suite and the nexustiles core.
+SDAP is tested with Solr version 8.11.1.
 
-.. note:: Mounting a volume is optional but if you choose to do it, you can start and stop the Solr container without having to reingest your data every time. If you do not mount a volume, every time you stop your Solr container the data will be lost.
+.. note:: Mounting a volume is optional but if you choose to do it, you can start and stop the Solr container without having to reingest your data every time. If you do not mount a volume, every time you stop your Solr container the data will be lost. If you don't want a volume, leave off the ``-v`` option in the following ``docker run`` command.
 
 To start Solr using a volume mount and expose the admin webapp on port 8983:
 
 .. code-block:: bash
 
   export SOLR_DATA=~/nexus-quickstart/solr
-  docker run --name solr --network sdap-net -v ${SOLR_DATA}:/opt/solr/server/solr/nexustiles/data -p 8983:8983 -d sdap/solr-singlenode:${VERSION}
+  mkdir -p ${SOLR_DATA}
+  docker run --name solr --network sdap-net -v ${SOLR_DATA}/:/opt/solr/server/solr/nexustiles/data -p 8983:8983 -e ZK_HOST="host.docker.internal:2181/solr" -d nexusjpl/solr:${SOLR_VERSION}
+
+This will start an instance of Solr. To initialize it, we need to run the ``solr-cloud-init`` image.
 
-If you don't want to use a volume, leave off the ``-v`` option.
+.. code-block:: bash
 
+  docker run -it --rm --name solr-init --network sdap-net -e SDAP_ZK_SOLR="host.docker.internal:2181/solr" -e SDAP_SOLR_URL="http://host.docker.internal:8983/solr/" -e CREATE_COLLECTION_PARAMS="name=nexustiles&numShards=1&waitForFinalState=true" nexusjpl/solr-cloud-init:${SOLR_CLOUD_INIT_VERSION}
 
-.. _quickstart-step5:
+When the init script finishes, kill the container by typing ``Ctrl + C``.
 
-Start Cassandra
-----------------
+.. _quickstart-step6:
 
-SDAP is tested with Cassandra version 2.2.x. The SDAP docker image is based off of the official Cassandra image and simply mounts the schema DDL script into the container for easy initialization.
+Starting Cassandra
+-------------------
+
+SDAP is tested with Cassandra version 3.11.6.
 
-.. note:: Similar to the Solr container, using a volume is recommended but not required.
+.. note:: Similar to the Solr container, using a volume is recommended but not required. Be aware that the second ``-v`` option is required.
 
-To start cassandra using a volume mount and expose the connection port 9042:
+Before starting Cassandra, we need to prepare a script to initialize the database.
+
+.. code-block:: bash
+
+  export CASSANDRA_INIT=~/nexus-quickstart/init
+  mkdir -p ${CASSANDRA_INIT}
+  cat << EOF >> ${CASSANDRA_INIT}/initdb.cql
+  CREATE KEYSPACE IF NOT EXISTS nexustiles WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 1 };
+
+  CREATE TABLE IF NOT EXISTS nexustiles.sea_surface_temp  (
+  tile_id      uuid PRIMARY KEY,
+  tile_blob    blob
+  );
+  EOF
+
+Now we can start the image and run the initialization script.
 
 .. code-block:: bash
 
   export CASSANDRA_DATA=~/nexus-quickstart/cassandra
-  docker run --name cassandra --network sdap-net -p 9042:9042 -v ${CASSANDRA_DATA}:/var/lib/cassandra -d sdap/cassandra:${VERSION}
+  mkdir -p ${CASSANDRA_DATA}
+  docker run --name cassandra --network sdap-net -p 9042:9042 -v ${CASSANDRA_DATA}/cassandra/:/var/lib/cassandra -v "${CASSANDRA_INIT}/initdb.cql:/scripts/initdb.cql" -d bitnami/cassandra:${CASSANDRA_VERSION}
 
-.. _quickstart-step6:
+Wait a few moments for the database to start.
+
+.. code-block:: bash
+
+  docker exec cassandra bash -c "cqlsh -u cassandra -p cassandra -f /scripts/initdb.cql"
 
-Ingest Data
-============
+With Solr and Cassandra started and initialized, we can now start the collection manager and granule ingester(s).
 
-Now that Solr and Cassandra have both been started and configured, we can ingest some data. NEXUS ingests data using the ningester docker image. This image is designed to read configuration and data from volume mounts and then tile the data and save it to the datastores. More information can be found in the :ref:`ningester` section.
+.. _quickstart-step7:
 
-Ningester needs 3 things to run:
+Start RabbitMQ
+----------------
+
+The collection manager and granule ingester(s) use RabbitMQ to communicate, so we need to start that up first.
+
+.. code-block:: bash
 
-#. Tiling configuration. How should the dataset be tiled? What is the dataset called? Are there any transformations that need to happen (e.g. kelvin to celsius conversion)? etc...
-#. Connection configuration. What should be used for metadata storage and where can it be found? What should be used for data storage and where can it be found?
-#. Data files. The data that will be ingested.
+  docker run -dp 5672:5672 -p 15672:15672 --name rmq --network sdap-net bitnami/rabbitmq:${RMQ_VERSION}
 
-Tiling configuration
+.. _quickstart-step8:
+
+Start the Granule Ingester(s)
+-----------------------------
+
+The granule ingester(s) read new granules from the message queue and process them into tiles. For the set of granules we will be using in this guide, we recommend using two ingester containers to speed up the process.
+
+.. code-block:: bash
+
+  docker run --name granule-ingester-1 --network sdap-net -e RABBITMQ_HOST="host.docker.internal:5672" -e RABBITMQ_USERNAME="user" -e RABBITMQ_PASSWORD="bitnami" -d -e CASSANDRA_CONTACT_POINTS=host.docker.internal -e CASSANDRA_USERNAME=cassandra -e CASSANDRA_PASSWORD=cassandra -e SOLR_HOST_AND_PORT="http://host.docker.internal:8983" -v ${DATA_DIRECTORY}:/data/granules/ nexusjpl/granule-ingester:${GRANULE_INGESTER_VERSION}

Review Comment:
   Perhaps some newlines here to make this more readable
   
   ```bash
   docker run --name granule-ingester-1 --network sdap-net -e RABBITMQ_HOST="host.docker.internal:5672" \
        -e RABBITMQ_USERNAME="user" -e RABBITMQ_PASSWORD="bitnami" -d -e CASSANDRA_CONTACT_POINTS=host.docker.internal \
        -e CASSANDRA_USERNAME=cassandra -e CASSANDRA_PASSWORD=cassandra -e SOLR_HOST_AND_PORT="http://host.docker.internal:8983" \
        -v ${DATA_DIRECTORY}:/data/granules/ nexusjpl/granule-ingester:${GRANULE_INGESTER_VERSION}
   ```



##########
docs/quickstart.rst:
##########
@@ -64,181 +80,240 @@ The network we will be using for this quickstart will be called ``sdap-net``. Cr
 
 .. _quickstart-step3:
 
-Download Sample Data
----------------------
+Start Ingester Components and Ingest Some Science Data

Review Comment:
   Sorry for being nitpicky, but I don't think this makes a very good "H1" header. Maybe something simpler like "Ingest Data", or break it apart further?
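
   For example, reusing the guide's existing heading style:

   ```
   Ingest Data
   ============
   ```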



##########
docs/quickstart.rst:
##########
@@ -18,43 +18,59 @@ This quickstart guide will walk you through how to install and run NEXUS on your
 Prerequisites
 ==============
 
-* Docker (tested on v18.03.1-ce)
+* Docker (tested on v20.10.17)
 * Internet Connection
-* bash
+* bash or zsh
 * cURL
-* 500 MB of disk space
+* 10.5 GB of disk space

Review Comment:
   😱 why is so much disk space needed??
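
   If it helps to sanity-check that number, after pulling everything you can see where the space actually goes (totals will vary a bit by platform):

   ```bash
   # summarize disk used by images, containers, and volumes
   docker system df
   # per-image sizes
   docker images
   ```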



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@sdap.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
