DaveDuggins commented on code in PR #15809:
URL: https://github.com/apache/pulsar/pull/15809#discussion_r922398210


##########
site2/docs/architecture-overview.md:
##########
@@ -0,0 +1,143 @@
+---
+
+id: concepts-architecture-overview
+
+title: Architecture overview
+
+sidebar_label: Concepts
+
+---
+
+The following overview describes the components that make up a Pulsar cluster, 
from general to specific.  
+
+### Instance
+
+***
+
+A Pulsar instance is composed of one or more Pulsar clusters. Clusters within 
an instance can [replicate](concepts-replication.md) data amongst themselves.
+
+### Cluster
+
+***
+
+![Pulsar architecture diagram](/assets/pulsar-system-architecture.svg)
+
+In a Pulsar cluster:
+
+* One or more **brokers** handle and load-balance incoming messages from 
**producers**, dispatch **messages** to **consumers**, communicate with the 
Pulsar **configuration store** to handle various coordination tasks, store 
messages in BookKeeper instances (aka **bookies**), rely on a 
cluster-specific ZooKeeper cluster for certain tasks, and more.
+
+* A BookKeeper cluster consisting of one or more bookies handles [persistent 
storage](#persistent-storage) of messages.
+
+* A ZooKeeper cluster specific to that cluster handles coordination tasks 
between Pulsar clusters.
+
+An instance-wide ZooKeeper cluster called the **configuration store** handles 
coordination tasks involving multiple clusters, for example, 
[geo-replication](concepts-replication.md).
+
+For a guide to managing Pulsar clusters, see the 
[clusters](admin-api-clusters.md) guide.
+
+### Producer
+
+***
+
+A producer is a process that attaches to a topic and publishes messages to a 
Pulsar [broker](reference-terminology.md#broker). The Pulsar broker processes 
the messages.
+
+Refer to the [producer](concepts-producer.md) topic for more information.
+
+### Topic
+
+***
+
+![Topic](/assets/producer-topic-consumer.svg)
+
+As in other pub-sub systems, topics in Pulsar are named channels for 
transmitting messages from producers to consumers. Topic names are URLs that 
have a well-defined structure:
+
+```http
+
+{persistent|non-persistent}://tenant/namespace/topic
+
+```
+
+| Topic name component | Description |
+|:--------------------|:-----------|
+| `persistent` / `non-persistent` | The type of topic. Pulsar supports two kinds of topics: [persistent](concepts-architecture-overview.md#persistent-storage) and [non-persistent](#non-persistent-topics). The default is persistent, so if you do not specify a type, the topic is persistent. With persistent topics, all messages are durably persisted on disks (if the broker is not standalone, messages are durably persisted on multiple disks), whereas data for non-persistent topics is not persisted to storage disks. |
+| `tenant` | The topic tenant within the instance. Tenants are essential to multi-tenancy in Pulsar and can be spread across clusters. |
+| `namespace` | The administrative unit of the topic, which acts as a grouping mechanism for related topics. Most topic configuration is performed at the [namespace](#namespaces) level. Each tenant has one or more namespaces. |
+| `topic` | The final part of the name. Topic names have no special meaning in a Pulsar instance. |
+
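To make the naming structure concrete, here is a minimal Python sketch that splits a topic name into the four components from the table. The `TopicName` tuple and the `parse_topic_name` helper are hypothetical names introduced for illustration; they are not part of any Pulsar client library. The sketch covers only the `{persistent|non-persistent}://tenant/namespace/topic` form described above, plus the case where the type prefix is omitted.

```python
from typing import NamedTuple


class TopicName(NamedTuple):
    """Hypothetical container for the four parts of a topic name."""
    domain: str      # "persistent" or "non-persistent"
    tenant: str
    namespace: str
    topic: str


VALID_DOMAINS = ("persistent", "non-persistent")


def parse_topic_name(name: str) -> TopicName:
    """Split a topic name of the form domain://tenant/namespace/topic.

    Illustrative only: when the type prefix is missing, the topic is
    treated as persistent, matching the default described in the table.
    """
    if "://" in name:
        domain, rest = name.split("://", 1)
    else:
        domain, rest = "persistent", name

    if domain not in VALID_DOMAINS:
        raise ValueError(f"unknown topic type: {domain!r}")

    parts = rest.split("/")
    # Exactly three non-empty parts are expected: tenant, namespace, topic.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic, got {rest!r}")

    tenant, namespace, topic = parts
    return TopicName(domain, tenant, namespace, topic)
```

For example, `parse_topic_name("my-tenant/app1/orders")` yields a `TopicName` whose `domain` is `"persistent"`, because no type was specified.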
+![tenants](/assets/tenants.svg)
+
+Refer to [topic](concepts-topic.md) for more information.
+
+### Consumer
+
+***
+
+A consumer is a process that attaches to a topic via a subscription and then 
receives messages.
+
+A consumer sends a [flow permit 
request](developing-binary-protocol.md#flow-control) to a broker to get 
messages. The broker pushes messages into a queue on the consumer side. You 
can configure the queue size with the 
[`receiverQueueSize`](client-libraries-java.md#configure-consumer) parameter 
(the default size is `1000`). Each time `consumer.receive()` is called, a 
message is dequeued from this queue.
+
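The interplay between flow permits and the receiver queue can be pictured with a toy model. The Python sketch below is not the Pulsar client: `ToyBroker`, `ToyConsumer`, and their methods are hypothetical names, and the refill policy (request permits only when the queue is empty, and return `None` instead of blocking) is a simplifying assumption made for illustration.

```python
from collections import deque


class ToyBroker:
    """Hypothetical stand-in for a broker holding a backlog of messages."""

    def __init__(self, backlog):
        self.backlog = deque(backlog)

    def push(self, permits):
        """Hand out at most `permits` messages, as granted by the consumer."""
        batch = []
        while self.backlog and len(batch) < permits:
            batch.append(self.backlog.popleft())
        return batch


class ToyConsumer:
    """Hypothetical consumer with a bounded receiver queue."""

    def __init__(self, broker, receiver_queue_size=1000):
        self.broker = broker
        self.receiver_queue_size = receiver_queue_size
        self.queue = deque()

    def _request_flow(self):
        # Ask only for as many messages as the queue still has room for.
        permits = self.receiver_queue_size - len(self.queue)
        self.queue.extend(self.broker.push(permits))

    def receive(self):
        """Dequeue one message, or return None if nothing is available."""
        if not self.queue:
            self._request_flow()
        if not self.queue:
            return None
        return self.queue.popleft()
```

With `ToyConsumer(ToyBroker(range(5)), receiver_queue_size=2)`, the first `receive()` triggers a request for two permits, so the broker pushes two messages and keeps the remaining three in its backlog until the consumer asks again.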
+Refer to the [consumer](concepts-consumer.md) topic for more information.
+
+### Broker
+
+***
+
+The **Pulsar message broker** is a stateless component that's primarily 
responsible for running two other components:
+
+* An HTTP server that exposes an {@inject: rest:REST:/} API for both 
administrative tasks and [topic lookup](concepts-clients.md#client-setup-phase) 
for producers and consumers. The producers connect to the brokers to publish 
messages and the consumers connect to the brokers to consume the messages.
+
+* A dispatcher, which is an asynchronous TCP server over a custom [binary 
protocol](developing-binary-protocol.md) used for all data transfers.
+
+![Broker](/assets/broker.svg)
+
+Messages are typically dispatched out of a [managed ledger](#managed-ledgers) 
cache for the sake of performance, *unless* the backlog exceeds the cache size. 
If the backlog grows too large for the cache, the broker will start reading 
entries from BookKeeper.
+
+Finally, to support geo-replication on global topics, the broker manages 
replicators that tail the entries published in the local region and republish 
them to the remote region using the Pulsar [Java client 
library](client-libraries-java.md).
+
+> For a guide to managing Pulsar brokers, see the 
[brokers](admin-api-brokers.md) guide.
+
+### Namespace
+
+***
+
+![Namespace](/assets/namespace.svg)
+
+A namespace is a logical nomenclature within a tenant. A tenant can create 
multiple namespaces via the [admin API](admin-api-namespaces.md#create). For 
instance, a tenant with different applications can create a separate namespace 
for each application. A namespace allows the application to create and manage a 
hierarchy of topics. For example, `my-tenant/app1` is a namespace for the 
application `app1` under the tenant `my-tenant`. You can create any number of 
[topics](#topics) under the namespace.
+
+### Metadata Store
+
+***
+
+The Pulsar metadata store maintains all the metadata of a Pulsar cluster, such 
as topic metadata, schema, broker load data, and so on. Pulsar uses [Apache 
ZooKeeper](https://zookeeper.apache.org/) for metadata storage, cluster 
configuration, and coordination. The Pulsar metadata store can be deployed on a 
separate ZooKeeper cluster or an existing ZooKeeper cluster. You can use one 
ZooKeeper cluster for both the Pulsar metadata store and the [BookKeeper metadata 
store](https://bookkeeper.apache.org/docs/latest/getting-started/concepts/#metadata-storage). 
If you want to deploy Pulsar brokers connected to an existing BookKeeper 
cluster, you need to deploy separate ZooKeeper clusters for the Pulsar metadata 
store and the BookKeeper metadata store.
+
+> Pulsar also supports other metadata backend services, including 
[etcd](https://etcd.io/) and [RocksDB](http://rocksdb.org/) (for standalone 
Pulsar only).

Review Comment:
   It is prominent enough without highlighting. It's the first item on the page 
and there are several code snippets in the bullet statements, which draw 
attention. Leaving as is.


