This is an automated email from the ASF dual-hosted git repository.
bbejeck pushed a commit to branch 1.1
in repository https://gitbox.apache.org/repos/asf/kafka.git
The following commit(s) were added to refs/heads/1.1 by this push:
new 92f4e4e port paragrpah from CP docs (#7808)
92f4e4e is described below
commit 92f4e4ebba0d05f7349b6a03d114ec12ea428307
Author: A. Sophie Blee-Goldman <[email protected]>
AuthorDate: Mon Dec 9 13:35:17 2019 -0800
port paragrpah from CP docs (#7808)
The AK Streams architecture docs should explain how the maximum parallelism
is determined
Reviewers: Bill Bejeck <[email protected]>
---
docs/streams/architecture.html | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/docs/streams/architecture.html b/docs/streams/architecture.html
index 8bc3156..7efd7ea 100644
--- a/docs/streams/architecture.html
+++ b/docs/streams/architecture.html
@@ -66,6 +66,14 @@
</p>
<p>
+ Slightly simplified, the maximum parallelism at which your application may run is bounded by the maximum number of stream tasks, which itself is determined by
+ maximum number of partitions of the input topic(s) the application is reading from. For example, if your input topic has 5 partitions, then you can run up to 5
+ applications instances. These instances will collaboratively process the topic’s data. If you run a larger number of app instances than partitions of the input
+ topic, the “excess” app instances will launch but remain idle; however, if one of the busy instances goes down, one of the idle instances will resume the former’s
+ work.
+ </p>
+
+ <p>
It is important to understand that Kafka Streams is not a resource manager, but a library that "runs" anywhere its stream processing application runs.
Multiple instances of the application are executed either on the same machine, or spread across multiple machines and tasks can be distributed automatically
by the library to those running application instances. The assignment of partitions to tasks never changes; if an application instance fails, all its assigned