Hi,

The goal of this email is to summarize and unify the discussion started across 
several email threads (Storm 2.0 
Roadmap<http://search-hadoop.com/?project=Storm&q=%22%5BDISCUSS%5D+Storm+2.0+Roadmap%22>,
 1.1.1 Release 
Planning<http://search-hadoop.com/m/Storm/8gnYyGagLDWv1qG?subj=Release+Planning+for+1+1+1+and+others+>,
 Lag 
Issues<http://search-hadoop.com/m/Storm/8gnYyLmjIjYr692?subj=Lag+issues+using+Storm+1+1+1+latest+build+with+StormKafkaClient+1+1+1+vs+old+StormKafka+spouts>)
 concerning the maintenance, branch support, and eventual deprecation of 
storm-kafka and storm-kafka-client.

An earlier 
discussion<http://search-hadoop.com/?project=Storm&q=%22%5BDISCUSS%5D+Storm+2.0+Roadmap%22>
 proposed deprecating storm-kafka in favor of storm-kafka-client. To clarify, 
the idea is not to completely eliminate storm-kafka, but rather keep supporting 
it in the 1.x-branch, while removing it from master (i.e. Storm 2.0 onwards). 
That is, storm-kafka-client will then become the only Storm Kafka option 
available for Storm 2.0 onwards, provided that we have enough confidence in its 
stability by the time of the Storm 2.0 release.

The main reason for this proposal is that the Kafka community 
agreed<https://cwiki.apache.org/confluence/display/KAFKA/KIP-109:+Old+Consumer+Deprecation>
 to deprecate the old consumer APIs starting in version 0.10.2, and will remove 
them in the next major version (0.12). This implies that storm-kafka will not 
work for Kafka 0.12 onwards. Important features missing in the old Kafka 
consumer are: security, the new message format, and fetching offsets based on 
timestamp (KIP-79).

In earlier discussions the Storm community has raised concerns about the 
performance and stability of the storm-kafka-client. Those concerns are valid 
and were mirrored by the Kafka community in their early deprecation 
discussions. I agree with what was said in the Kafka 
discussion<http://search-hadoop.com/m/Kafka/uyzND1e4bUP1Rjq721>: the 
storm-kafka-client has bugs, but so does storm-kafka, and all the development 
is currently going into storm-kafka-client, a trend that will only strengthen 
as Kafka discontinues the old consumer APIs. The only way to 
stabilize a complex component such as storm-kafka-client is to test it 
extensively in all its variants, which inevitably requires users running it. 
Furthermore, removing storm-kafka from Storm 2.0 does not prevent users from 
still referring to storm-kafka version 1.x in their topologies.

I did a quick analysis of the JIRA issues for storm-kafka and 
storm-kafka-client [1].  As of July 11 there are 22 open or in-progress bugs 
for storm-kafka (1 blocker) and 15 for storm-kafka-client.

The recent refactoring around manual partition assignment should solve a lot of 
edge-case bugs that occurred during rebalance. There are also a few open pull 
requests for Trident and for fixing some internal state details such as 
maxUncommittedOffsets, topic compaction, etc. Nevertheless, there are several 
areas that need to be addressed to stabilize and improve storm-kafka-client. 
Similarly to what was done for Storm SQL, I suggest that we create a wiki page 
where we can centralize action items such as:

Features / Stability
 * Memory Footprint
 * Retrial Mechanism
 * Exactly-once and at-least-once guarantees
 * Kafka Lag
 * Metrics
 * Spout Internals (e.g. maxUncommittedOffsets, ack, emitted, failed, ...)
 * Autocommit mode

Performance
 * Run performance benchmarks

Integration Testing
 * Test for exactly-once in non-failure scenarios (e.g. activate/deactivate)
 * Test for at-least-once in failure scenarios
 * Test Trident guarantees

Unit Testing
 * Identify unit test coverage and find a modular way to continually add new 
tests

Trident
 * Pull request<https://github.com/apache/storm/pull/2174> for review

API
 * Investigate gaps in the API between storm-kafka and storm-kafka-client
 * Can we discontinue the old API?

Documentation
 * Check the accuracy and completeness of the documentation
 * Make clean code snippets with examples available

[1] - The data was extracted from JIRA on July 11, 2017. The storm-kafka-client 
JIRAs were checked for correctness of component label, and had their status 
updated. Neither was done for the storm-kafka JIRAs, so some of the 
storm-kafka issues marked as open may already have been fixed. The results and 
charts can be found here:
    * storm-kafka-jiras<https://docs.google.com/spreadsheets/d/1pdqAKDtqfhPrfgFxnQa4bSrKP1YBdMyuGzqr3gLzcMA/edit?usp=sharing>
    * storm-kafka-client-jiras<https://docs.google.com/spreadsheets/d/12g0HLz4pgODMVVOmzvti1nzLOa6iygmk8pyTOv8op1c/edit?usp=sharing>
