ivankelly commented on a change in pull request #1466: Topic compaction documentation
URL: https://github.com/apache/incubator-pulsar/pull/1466#discussion_r190841826
 
 

 ##########
 File path: site/docs/latest/getting-started/ConceptsAndArchitecture.md
 ##########
 @@ -522,18 +541,55 @@ while (true) {
 To create a reader that will read from the latest available message:
 
 ```java
-MessageId id = MessageId.latest;
-Reader reader = pulsarClient.createReader(topic, id, new ReaderConfiguration());
+Reader<byte[]> reader = pulsarClient.newReader()
+    .topic(topic)
+    .startMessageId(MessageId.latest)
+    .create();
 ```
 
 To create a reader that will read from some message between earliest and latest:
 
 ```java
 byte[] msgIdBytes = // Some byte array
 MessageId id = MessageId.fromByteArray(msgIdBytes);
-Reader reader = pulsarClient.createReader(topic, id, new ReaderConfiguration());
+Reader<byte[]> reader = pulsarClient.newReader()
+    .topic(topic)
+    .startMessageId(id)
+    .create();
 ```
 
+## Topic compaction {#compaction}
+
+Pulsar was built with highly scalable [persistent storage](#persistent-storage) of message data as a primary objective. Pulsar {% popover topics %} enable you to persistently store as many unacknowledged messages as you need while preserving message ordering. By default, Pulsar stores *all* unacknowledged/unprocessed messages produced on a topic. Accumulating many unacknowledged messages on a topic is necessary for many Pulsar use cases but it can also be very time intensive for Pulsar {% popover consumers %} to "rewind" through the entire log of messages.
+
+{% include admonition.html type="success" content="For a more practical guide to topic compaction, see the [Topic compaction cookbook](../../cookbooks/compaction)." %}
+
+For some use cases, however, consumers don't need a complete "image" of the topic log. They may only need a few values to construct a more "shallow" image of the log, perhaps even just the most recent value. For these kinds of use cases Pulsar offers **topic compaction**. When you run compaction on a topic, Pulsar goes through a topic's backlog and removes messages that are *obscured* by later messages, i.e. it goes through the topic on a per-key basis and leaves only the most recent message associated with that key.
+
+Pulsar's topic compaction feature:
+
+* Can help preserve disk space and allow for much more efficient "rewind" of topic logs
+* Applies only to [persistent topics](#persistent-storage)
+* Is triggered manually via the command line. See the [Topic compaction cookbook](../../cookbooks/compaction)
+* Is conceptually and operationally distinct from [retention and expiry](#message-retention-and-expiry)
 
 Review comment:
   However, compaction does respect retention. If retention has removed a message from the message backlog, that message is also removed from the compacted topic ledger (or rather, it is no longer readable).
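
   The per-key behaviour described in the hunk, together with the retention point above, could be illustrated with a small sketch. This is not Pulsar's implementation: `CompactionSketch`, `Msg`, `compact`, and `readable` are hypothetical names, the log is an in-memory list, and retention is modeled (as an assumption) by a simple cutoff position below which messages are no longer readable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CompactionSketch {

    // Hypothetical stand-in for a keyed message; not a Pulsar client type.
    static final class Msg {
        final long position; // offset of the message in the topic log
        final String key;
        final String value;

        Msg(long position, String key, String value) {
            this.position = position;
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return key + "=" + value + "@" + position;
        }
    }

    // Keep only the most recent message per key, in log order of the survivors.
    static List<Msg> compact(List<Msg> log) {
        Map<String, Msg> latest = new LinkedHashMap<>();
        for (Msg m : log) {
            latest.remove(m.key); // re-inserting moves the key to the end
            latest.put(m.key, m);
        }
        return new ArrayList<>(latest.values());
    }

    // Assumption: retention is modeled as a cutoff position; anything
    // older than the cutoff is treated as unreadable, compacted or not.
    static List<Msg> readable(List<Msg> compacted, long firstRetainedPosition) {
        List<Msg> out = new ArrayList<>();
        for (Msg m : compacted) {
            if (m.position >= firstRetainedPosition) {
                out.add(m);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Msg> log = new ArrayList<>();
        log.add(new Msg(0, "a", "1"));
        log.add(new Msg(1, "b", "2"));
        log.add(new Msg(2, "a", "3"));

        List<Msg> compacted = compact(log);
        System.out.println(compacted);              // [b=2@1, a=3@2]
        System.out.println(readable(compacted, 2)); // [a=3@2]
    }
}
```

   In this sketch the message for key `b` survives compaction because it is the latest for its key, but it stops being readable once the modeled retention cutoff moves past position 1. That is the interaction the bullet list might want to mention.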
