liuxunorg commented on a change in pull request #59: SUBMARINE-254. Workbench server support cluster mode
URL: https://github.com/apache/hadoop-submarine/pull/59#discussion_r337501200
##########
File path: docs/design/SubmarineClusterServer.md
##########
@@ -0,0 +1,110 @@
+# Submarine Cluster Server Design
+
+## Introduction
+The Submarine system contains a total of two Server services, Submarine Server and Workbench Server, which are long-running in the form of Daemon.
+
+Among them, Submarine Server mainly provides job submission, job scheduling, job status monitoring, and model online service for Submarine.
+
+Workbench Server is mainly for Submarine Workbench WEB is mainly for algorithm users to provide algorithm development, Python/Spark interpreter operation and other services through Notebook.
+The goal of the Submarine project is to provide high availability and high reliability services for big data processing, algorithm development, job scheduling, model online services, model batch and incremental updates. In addition to the high availability of big data and machine learning frameworks, the high availability of Submarine Server and Workbench Server itself is a key consideration.
+
+## Requirement
+
+### Cluster Metadata Center
+
+Multiple Submarine (or Workbench) Server processes create a Submarine Cluster through the RAFT algorithm library. The cluster internally maintains a metadata center. All servers can operate the metadata. The RAFT algorithm ensures that multiple processes are simultaneously co-located. A data modification will not cause problems such as mutual coverage and dirty data.
+
+This metadata center stores data by means of key-value pairs. It is very easy to use a variety of data, but it should be noted that metadata is only suitable for storing small amounts of data and cannot be used to replace data storage.

Review comment:
Done.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
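
For readers of the quoted "Cluster Metadata Center" paragraph, the following is a minimal Python sketch of the idea that the metadata center is a key-value store fed by one ordered sequence of writes and meant only for small values. It is an in-memory stand-in and does not implement RAFT or any replication. The class name `MetadataCenter`, the `MAX_VALUE_BYTES` limit, and the key `server/leader` are all hypothetical; the design document names none of them and gives no concrete size limit.

```python
# Hypothetical sketch: none of these names come from the Submarine code base.
MAX_VALUE_BYTES = 1024  # assumed cap; the design doc gives no concrete limit


class MetadataCenter:
    """In-memory stand-in for the replicated key-value metadata center."""

    def __init__(self, max_value_bytes=MAX_VALUE_BYTES):
        self._max_value_bytes = max_value_bytes
        self._log = []    # ordered list of accepted (key, value) writes
        self._state = {}  # current key -> value view built from the log

    def put(self, key, value):
        """Append a write to the log and apply it; reject oversized values."""
        size = len(value.encode("utf-8"))
        if size > self._max_value_bytes:
            raise ValueError(
                f"value for {key!r} is {size} bytes; metadata must stay small"
            )
        self._log.append((key, value))
        self._state[key] = value
        return len(self._log)  # 1-based position of this write in the log

    def get(self, key, default=None):
        """Return the latest value written for key, or default if absent."""
        return self._state.get(key, default)


center = MetadataCenter()
center.put("server/leader", "host-a:8080")
center.put("server/leader", "host-b:8080")  # the later write in the log wins
```

In a real cluster the ordered log would be agreed on by the RAFT library across all server processes, which is what keeps concurrent modifications from overwriting each other; here a plain list plays that role inside a single process.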