congbobo184 commented on code in PR #18242:
URL: https://github.com/apache/pulsar/pull/18242#discussion_r1013578128
##########
site2/docs/schema-understand.md:
##########
@@ -121,109 +81,15 @@ Currently, Pulsar supports the following complex types:
| `keyvalue` | Represents a complex type of a key/value pair. |
| `struct` | Handles structured data. It supports `AvroBaseStructSchema` and
`ProtobufNativeSchema`. |
-#### keyvalue
-
-`Keyvalue` schema helps applications define schemas for both key and value.
+#### `keyvalue` schema
Review Comment:
```suggestion
#### `KeyValue` schema
```
##########
site2/docs/schema-understand.md:
##########
@@ -121,109 +81,15 @@ Currently, Pulsar supports the following complex types:
| `keyvalue` | Represents a complex type of a key/value pair. |
| `struct` | Handles structured data. It supports `AvroBaseStructSchema` and
`ProtobufNativeSchema`. |
-#### keyvalue
-
-`Keyvalue` schema helps applications define schemas for both key and value.
+#### `keyvalue` schema
-For `SchemaInfo` of `keyvalue` schema, Pulsar stores the `SchemaInfo` of key
schema and the `SchemaInfo` of value schema together.
+`Keyvalue` schema helps applications define schemas for both key and value.
Pulsar stores the `SchemaInfo` of key schema and the `SchemaInfo` of value
schema together.
Review Comment:
```suggestion
`KeyValue` schema helps applications define schemas for both key and value.
Pulsar stores the `SchemaInfo` of key schema and the value schema together.
```
##########
site2/docs/schema-overview.md:
##########
@@ -0,0 +1,154 @@
+---
+id: schema-overview
+title: Overview
+sidebar_label: "Overview"
+---
+
+This section introduces the following content:
+* [What is Pulsar Schema](#what-is-pulsar-schema)
+* [Why use it](#why-use-it)
+* [How it works](#how-it-works)
+* [Use case](#use-case)
+* [What's next?](#whats-next)
+
+## What is Pulsar Schema
+
+Pulsar messages are stored as unstructured byte arrays and the data structure
(as known as schema) is applied to this data only when it's read. The schema
serializes the bytes before they are published to a topic and deserializes them
before they are delivered to the consumers, dictating which data types are
recognized as valid for a given topic.
+
+Pulsar schema registry is a central repository to store the schema
information, which enables producers/consumers to coordinate on the schema of a
topic’s data through brokers.
+
+:::note
+
+Currently, Pulsar schema is only available for the [Java
client](client-libraries-java.md), [Go client](client-libraries-go.md), [Python
client](client-libraries-python.md), and [C++ client](client-libraries-cpp.md).
+
+:::
+
+## Why use it
+
+Type safety is extremely important in any application built around a messaging
and streaming system. Raw bytes are flexible for data transfer, but the
flexibility and neutrality come with a cost: you have to overlay data type
checking and serialization/deserialization to ensure that the bytes fed into
the system can be read and successfully consumed. In other words, you need to
make sure the data intelligible and usable to applications.
+
+Pulsar schema resolves the pain points with the following capabilities:
+* enforces the data type safety when a topic has a schema defined. As a
result, producers/consumers are only allowed to connect if they are using a
“compatible” schema.
+* provides a central location for storing information about the schemas used
within your organization, in turn greatly simplifies the sharing of this
information across application teams.
+* serves as a single source of truth for all the message schemas used across
all your services and development teams, which makes it easier for them to
collaborate.
+* keeps data compatibility on-track between schema versions. When new schemas
are uploaded, the new versions can be read by old consumers.
+* stored in the existing storage layer BookKeeper, no additional system
required.
+
+## How it works
Review Comment:
Should this section move to `Understand schema`? Users may not understand how
it works at this point, and may not care about it in the Overview.
##########
site2/docs/schema-understand.md:
##########
@@ -121,109 +81,15 @@ Currently, Pulsar supports the following complex types:
| `keyvalue` | Represents a complex type of a key/value pair. |
| `struct` | Handles structured data. It supports `AvroBaseStructSchema` and
`ProtobufNativeSchema`. |
-#### keyvalue
-
-`Keyvalue` schema helps applications define schemas for both key and value.
+#### `keyvalue` schema
-For `SchemaInfo` of `keyvalue` schema, Pulsar stores the `SchemaInfo` of key
schema and the `SchemaInfo` of value schema together.
+`Keyvalue` schema helps applications define schemas for both key and value.
Pulsar stores the `SchemaInfo` of key schema and the `SchemaInfo` of value
schema together.
-Pulsar provides the following methods to encode a key/value pair in messages:
+You can choose the encoding type when constructing the key/value schema.:
+* `INLINE` - Key/value pairs are encoded together in the message payload.
+* `SEPARATED` - see [Construct a key/value
schema](schema-get-started.md#construct-a-keyvalue-schema).
-* `INLINE`
-
-* `SEPARATED`
-
-You can choose the encoding type when constructing the key/value schema.
-
-````mdx-code-block
-<Tabs
- defaultValue="INLINE"
-
values={[{"label":"INLINE","value":"INLINE"},{"label":"SEPARATED","value":"SEPARATED"}]}>
-
-<TabItem value="INLINE">
-
-Key/value pairs are encoded together in the message payload.
-
-</TabItem>
-<TabItem value="SEPARATED">
-
-Key is encoded in the message key and the value is encoded in the message
payload.
-
-**Example**
-
-This example shows how to construct a key/value schema and then use it to
produce and consume messages.
-
-1. Construct a key/value schema with `INLINE` encoding type.
-
- ```java
- Schema<KeyValue<Integer, String>> kvSchema = Schema.KeyValue(
- Schema.INT32,
- Schema.STRING,
- KeyValueEncodingType.INLINE
- );
- ```
-
-2. Optionally, construct a key/value schema with `SEPARATED` encoding type.
-
- ```java
- Schema<KeyValue<Integer, String>> kvSchema = Schema.KeyValue(
- Schema.INT32,
- Schema.STRING,
- KeyValueEncodingType.SEPARATED
- );
- ```
-
-3. Produce messages using a key/value schema.
-
- ```java
- Schema<KeyValue<Integer, String>> kvSchema = Schema.KeyValue(
- Schema.INT32,
- Schema.STRING,
- KeyValueEncodingType.SEPARATED
- );
-
- Producer<KeyValue<Integer, String>> producer = client.newProducer(kvSchema)
- .topic(TOPIC)
- .create();
-
- final int key = 100;
- final String value = "value-100";
-
- // send the key/value message
- producer.newMessage()
- .value(new KeyValue(key, value))
- .send();
- ```
-
-4. Consume messages using a key/value schema.
-
- ```java
- Schema<KeyValue<Integer, String>> kvSchema = Schema.KeyValue(
- Schema.INT32,
- Schema.STRING,
- KeyValueEncodingType.SEPARATED
- );
-
- Consumer<KeyValue<Integer, String>> consumer = client.newConsumer(kvSchema)
- ...
- .topic(TOPIC)
- .subscriptionName(SubscriptionName).subscribe();
-
- // receive key/value pair
- Message<KeyValue<Integer, String>> msg = consumer.receive();
- KeyValue<Integer, String> kv = msg.getValue();
- ```
-
-</TabItem>
-
-</Tabs>
-````
-
-#### struct
-
-This section describes the details of type and usage of the `struct` schema.
-
-##### Type
+#### `struct` schema
`struct` schema supports `AvroBaseStructSchema` and `ProtobufNativeSchema`.
Review Comment:
`AvroSchema`, `JsonSchema`, etc. are also struct schemas; we should add these later.
##########
site2/docs/schema-evolution-compatibility.md:
##########
@@ -6,29 +6,21 @@ sidebar_label: "Schema evolution and compatibility"
Normally, schemas do not stay the same over a long period of time. Instead,
they undergo evolutions to satisfy new needs.
-This chapter examines how Pulsar schema evolves and what Pulsar schema
compatibility check strategies are.
+This chapter introduces how Pulsar schema evolves and what compatibility check
strategies it adopts.
## Schema evolution
Review Comment:
Can we move this to `Understand Schema`? It feels awkward to put it here;
the user needs enough context to understand it.
##########
site2/docs/schema-understand.md:
##########
@@ -121,109 +81,15 @@ Currently, Pulsar supports the following complex types:
| `keyvalue` | Represents a complex type of a key/value pair. |
| `struct` | Handles structured data. It supports `AvroBaseStructSchema` and
`ProtobufNativeSchema`. |
-#### keyvalue
-
-`Keyvalue` schema helps applications define schemas for both key and value.
+#### `keyvalue` schema
-For `SchemaInfo` of `keyvalue` schema, Pulsar stores the `SchemaInfo` of key
schema and the `SchemaInfo` of value schema together.
+`Keyvalue` schema helps applications define schemas for both key and value.
Pulsar stores the `SchemaInfo` of key schema and the `SchemaInfo` of value
schema together.
-Pulsar provides the following methods to encode a key/value pair in messages:
+You can choose the encoding type when constructing the key/value schema.:
+* `INLINE` - Key/value pairs are encoded together in the message payload.
+* `SEPARATED` - see [Construct a key/value
schema](schema-get-started.md#construct-a-keyvalue-schema).
-* `INLINE`
-
-* `SEPARATED`
-
-You can choose the encoding type when constructing the key/value schema.
-
-````mdx-code-block
-<Tabs
- defaultValue="INLINE"
-
values={[{"label":"INLINE","value":"INLINE"},{"label":"SEPARATED","value":"SEPARATED"}]}>
-
-<TabItem value="INLINE">
-
-Key/value pairs are encoded together in the message payload.
-
-</TabItem>
-<TabItem value="SEPARATED">
-
-Key is encoded in the message key and the value is encoded in the message
payload.
-
-**Example**
-
-This example shows how to construct a key/value schema and then use it to
produce and consume messages.
-
-1. Construct a key/value schema with `INLINE` encoding type.
-
- ```java
- Schema<KeyValue<Integer, String>> kvSchema = Schema.KeyValue(
- Schema.INT32,
- Schema.STRING,
- KeyValueEncodingType.INLINE
- );
- ```
-
-2. Optionally, construct a key/value schema with `SEPARATED` encoding type.
-
- ```java
- Schema<KeyValue<Integer, String>> kvSchema = Schema.KeyValue(
- Schema.INT32,
- Schema.STRING,
- KeyValueEncodingType.SEPARATED
- );
- ```
-
-3. Produce messages using a key/value schema.
-
- ```java
- Schema<KeyValue<Integer, String>> kvSchema = Schema.KeyValue(
- Schema.INT32,
- Schema.STRING,
- KeyValueEncodingType.SEPARATED
- );
-
- Producer<KeyValue<Integer, String>> producer = client.newProducer(kvSchema)
- .topic(TOPIC)
- .create();
-
- final int key = 100;
- final String value = "value-100";
-
- // send the key/value message
- producer.newMessage()
- .value(new KeyValue(key, value))
- .send();
- ```
-
-4. Consume messages using a key/value schema.
-
- ```java
- Schema<KeyValue<Integer, String>> kvSchema = Schema.KeyValue(
- Schema.INT32,
- Schema.STRING,
- KeyValueEncodingType.SEPARATED
- );
-
- Consumer<KeyValue<Integer, String>> consumer = client.newConsumer(kvSchema)
- ...
- .topic(TOPIC)
- .subscriptionName(SubscriptionName).subscribe();
-
- // receive key/value pair
- Message<KeyValue<Integer, String>> msg = consumer.receive();
- KeyValue<Integer, String> kv = msg.getValue();
- ```
-
-</TabItem>
-
-</Tabs>
-````
-
-#### struct
-
-This section describes the details of type and usage of the `struct` schema.
-
-##### Type
+#### `struct` schema
Review Comment:
```suggestion
#### `Struct` schema
```
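Side note on the `INLINE` / `SEPARATED` bullets in this hunk: a small illustration may help readers see the difference. Below is a minimal sketch in plain Java that does not use the Pulsar client. The class `KeyValueEncodingSketch`, its methods, and the length-prefixed layout are assumptions made for illustration only; the actual wire format is defined by the Pulsar client and may differ.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of the Pulsar API.
public class KeyValueEncodingSketch {

    // Holder for the SEPARATED case: key and value travel in different places.
    public static final class SeparatedMessage {
        public final byte[] messageKey; // would be carried as the message key
        public final byte[] payload;    // would be carried as the message payload

        SeparatedMessage(byte[] messageKey, byte[] payload) {
            this.messageKey = messageKey;
            this.payload = payload;
        }
    }

    private static byte[] intBytes(int key) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(key).array();
    }

    // INLINE (assumed layout): [key length][key bytes][value length][value bytes],
    // all written into a single payload.
    public static byte[] encodeInline(int key, String value) {
        byte[] keyBytes = intBytes(key);
        byte[] valueBytes = value.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer
                .allocate(Integer.BYTES + keyBytes.length + Integer.BYTES + valueBytes.length)
                .putInt(keyBytes.length).put(keyBytes)
                .putInt(valueBytes.length).put(valueBytes)
                .array();
    }

    // Reads the value back out of an INLINE payload by skipping over the key.
    public static String decodeInlineValue(byte[] payload) {
        ByteBuffer buf = ByteBuffer.wrap(payload);
        int keyLength = buf.getInt();
        buf.position(buf.position() + keyLength);
        byte[] valueBytes = new byte[buf.getInt()];
        buf.get(valueBytes);
        return new String(valueBytes, StandardCharsets.UTF_8);
    }

    // SEPARATED: the key bytes and the value bytes are kept apart.
    public static SeparatedMessage encodeSeparated(int key, String value) {
        return new SeparatedMessage(intBytes(key), value.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] inline = encodeInline(100, "value-100");
        SeparatedMessage separated = encodeSeparated(100, "value-100");
        System.out.println("INLINE payload bytes: " + inline.length);
        System.out.println("SEPARATED key bytes: " + separated.messageKey.length
                + ", payload bytes: " + separated.payload.length);
    }
}
```

The point of the sketch is only that with `SEPARATED` the payload holds the value alone, so the key can be read without decoding the value.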
##########
site2/docs/schema-evolution-compatibility.md:
##########
@@ -6,29 +6,21 @@ sidebar_label: "Schema evolution and compatibility"
Normally, schemas do not stay the same over a long period of time. Instead,
they undergo evolutions to satisfy new needs.
-This chapter examines how Pulsar schema evolves and what Pulsar schema
compatibility check strategies are.
+This chapter introduces how Pulsar schema evolves and what compatibility check
strategies it adopts.
## Schema evolution
-Pulsar schema is defined in a data structure called `SchemaInfo`.
-
-Each `SchemaInfo` stored with a topic has a version. The version is used to
manage the schema changes happening within a topic.
-
The message produced with `SchemaInfo` is tagged with a schema version. When a
message is consumed by a Pulsar client, the Pulsar client can use the schema
version to retrieve the corresponding `SchemaInfo` and use the correct schema
information to deserialize data.
-### What is schema evolution?
-
Schemas store the details of attributes and types. To satisfy new business
requirements, you need to update schemas inevitably over time, which is called
**schema evolution**.
Any schema changes affect downstream consumers. Schema evolution ensures that
the downstream consumers can seamlessly handle data encoded with both old
schemas and new schemas.
-### How Pulsar schema should evolve?
-
-The answer is Pulsar schema compatibility check strategy. It determines how
schema compares old schemas with new schemas in topics.
+### How schema evolves?
-For more information, see [Schema compatibility check
strategy](#schema-compatibility-check-strategy).
+The answer is [schema compatibility check
strategy](#schema-compatibility-check-strategy). It determines how schema
compares old schemas with new schemas in topics.
Review Comment:
Can we delete it? It doesn't seem to say anything.