[DISCUSS] [Storm-Kafka] - Maintenance, Branch Support, and Deprecation Plans for storm-kafka and storm-kafka-client

Hugo Da Cruz Louro Wed, 19 Jul 2017 09:21:01 -0700

Hi,

The goal of this email is to summarize and unify the discussion started across
several email threads
(Storm 2.0 Roadmap<http://search-hadoop.com/?project=Storm&q=%22%5BDISCUSS%5D+Storm+2.0+Roadmap%22>,
1.1.1 Release Planning<http://search-hadoop.com/m/Storm/8gnYyGagLDWv1qG?subj=Release+Planning+for+1+1+1+and+others+>,
Lag Issues<http://search-hadoop.com/m/Storm/8gnYyLmjIjYr692?subj=Lag+issues+using+Storm+1+1+1+latest+build+with+StormKafkaClient+1+1+1+vs+old+StormKafka+spouts>)
concerning the maintenance, branch support, and eventual deprecation of
storm-kafka and storm-kafka-client.

An earlier
discussion<http://search-hadoop.com/?project=Storm&q=%22%5BDISCUSS%5D+Storm+2.0+Roadmap%22>
proposed deprecating storm-kafka in favor of storm-kafka-client. To clarify,
the idea is not to eliminate storm-kafka completely, but to keep supporting
it in the 1.x-branch while removing it from master (i.e. Storm 2.0 onwards).
storm-kafka-client would then become the only Storm Kafka option available
for Storm 2.0 onwards, provided that we have enough confidence in its
stability by the time of the Storm 2.0 release.

The main reason for this proposal is the fact that the Kafka community
agreed<https://cwiki.apache.org/confluence/display/KAFKA/KIP-109:+Old+Consumer+Deprecation>
to deprecate the old consumer APIs starting in version 0.10.2, and will remove
them in the next major version (0.12). This implies that storm-kafka will not
work for Kafka 0.12 onwards. Important features missing from the old Kafka
consumer are security, the new message format, and fetching offsets by
timestamp (KIP-79).

In earlier discussions the Storm community has raised concerns about the
performance and stability of storm-kafka-client. Those concerns are valid
and were mirrored by the Kafka community in their early deprecation
discussions. I agree with what was said in the Kafka
discussion<http://search-hadoop.com/m/Kafka/uyzND1e4bUP1Rjq721>:
storm-kafka-client has bugs, but so does storm-kafka, and all the development
is currently going into storm-kafka-client, a trend that will only grow as
Kafka discontinues the old consumer APIs. The only way to stabilize a complex
component such as storm-kafka-client is to test it extensively in all its
variants, and that inevitably requires users running it.
Furthermore, removing storm-kafka from Storm 2.0 does not prevent users from
still referring to storm-kafka version 1.x in their topologies.

I did a quick analysis of the JIRA issues for storm-kafka and
storm-kafka-client [1]. As of July 11 there are 22 open or in-progress bugs
for storm-kafka (1 blocker) and 15 for storm-kafka-client.

The recent refactoring around manual partition assignment should solve a lot of
edge-case bugs that occurred during rebalance. There are also a few open pull
requests for Trident and for fixing internal state details such as
maxUncommittedOffsets, topic compaction, etc. Nevertheless, several areas
still need to be addressed to stabilize and improve storm-kafka-client.
As was done for Storm SQL, I suggest that we create a wiki page where we can
centralize action items such as:

Features / Stability
* Memory Footprint
* Retry Mechanism
* Exactly once and at least once guarantees
* Kafka Lag
* Metrics
* Spout Internals (e.g. maxUncommittedOffsets, ack, emitted, failed, ...)
* Autocommit mode

Performance
* Run performance benchmarks

Integration Testing
* Test for exactly once in non-failure scenarios (e.g. activate/deactivate)
* Test for at least once in failure scenarios
* Test Trident guarantees

Unit Testing
* Identify unit test coverage and find a modular way to continually add new
tests

Trident
* Pull request<https://github.com/apache/storm/pull/2174> for review

API
* Investigate gaps in the API between storm-kafka and storm-kafka-client.
* Can we discontinue the old API?

Documentation
* Check for accuracy and completeness of documentation
* Make clean code snippets with examples available

[1] - The data was extracted from JIRA on 07/11/2017. The storm-kafka-client
JIRAs were checked for correctness of component label, and had their status
updated. None of that was done for the storm-kafka JIRAs, so some of those
issues marked as open may already have been fixed. The results and charts can
be found here:
* storm-kafka-jiras<https://docs.google.com/spreadsheets/d/1pdqAKDtqfhPrfgFxnQa4bSrKP1YBdMyuGzqr3gLzcMA/edit?usp=sharing>
* storm-kafka-client-jiras<https://docs.google.com/spreadsheets/d/12g0HLz4pgODMVVOmzvti1nzLOa6iygmk8pyTOv8op1c/edit?usp=sharing>
