GitHub user mjtieman opened a pull request:

    https://github.com/apache/storm/pull/454

    Storm 697: Support for Emitting Kafka Message Offset and Partition

    It would be nice to expose the offset and partition of messages consumed 
from Kafka to the Scheme generating the Tuples. This is useful for auditing or 
replaying data from arbitrary points on a Kafka topic: a consumer can save the 
partition and offset of each message in a discrete stream instead of persisting 
the entire message.
    
    * Added a new scheme that accepts a Partition and the message offset in its 
deserialization method.
    * Defined an overload of `KafkaUtils.generateTuples` to accept a 
`Partition` and offset in addition to the message byte[].
    * Added a flag in `KafkaConfig` to indicate whether the metadata (partition 
and offset) should be available during tuple generation.
    * Wrote a simple String implementation of the new scheme, 
`StringMessageAndMetadataScheme`, following the same pattern as 
`StringKeyValueScheme`.
    * Unit tests.
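
The changes listed above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in the pull request: `PartitionStub`, `SchemeSketch`, `MetadataSchemeSketch`, `StringMetadataSchemeSketch`, `TupleGeneratorSketch`, and the method name `deserializeWithMetadata` are hypothetical stand-ins for Storm's `Scheme`, the storm-kafka `Partition`, the new scheme, and the `KafkaUtils.generateTuples` overload, and the `emitMetadata` parameter stands in for the new `KafkaConfig` flag.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for storm-kafka's Partition; only the partition id is modeled.
final class PartitionStub {
    final int id;

    PartitionStub(int id) {
        this.id = id;
    }
}

// Hypothetical stand-in for Storm's Scheme: turns a raw message into tuple values.
interface SchemeSketch {
    List<Object> deserialize(byte[] message);
}

// Assumed shape of the new scheme: the same payload plus where it came from.
interface MetadataSchemeSketch extends SchemeSketch {
    List<Object> deserializeWithMetadata(byte[] message, PartitionStub partition, long offset);
}

// String implementation that appends the partition id and offset to the decoded payload.
final class StringMetadataSchemeSketch implements MetadataSchemeSketch {
    @Override
    public List<Object> deserialize(byte[] message) {
        return Arrays.<Object>asList(new String(message, StandardCharsets.UTF_8));
    }

    @Override
    public List<Object> deserializeWithMetadata(byte[] message, PartitionStub partition, long offset) {
        return Arrays.<Object>asList(
                new String(message, StandardCharsets.UTF_8), partition.id, offset);
    }
}

final class TupleGeneratorSketch {
    // Passes the metadata to the scheme only when the flag is set and the scheme
    // supports it; otherwise falls back to the payload-only path.
    static List<Object> generate(SchemeSketch scheme, boolean emitMetadata,
                                 byte[] message, PartitionStub partition, long offset) {
        if (emitMetadata && scheme instanceof MetadataSchemeSketch) {
            return ((MetadataSchemeSketch) scheme)
                    .deserializeWithMetadata(message, partition, offset);
        }
        return scheme.deserialize(message);
    }
}
```

With the flag off, the sketch emits only the decoded string, so existing topologies would see the same tuples as before; with it on, each tuple also carries the partition id and offset needed to replay from that point.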

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mjtieman/storm STORM-697

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/454.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #454
    
----
commit 5b4c28a088ffc62ebcc28e8c28a25d096aa1eb78
Author: matt.tieman <matt.tie...@inin.com>
Date:   2015-03-03T16:46:30Z

    STORM-697: Added tupleMetaData flag

commit 6e4fde20af8d285cdf4829e4c2c4aef4cd45d89d
Author: matt.tieman <matt.tie...@inin.com>
Date:   2015-03-03T16:47:38Z

    STORM-697: Overload of generateTuples to accept the Partition and offset

commit 6e768665320d08815c53f27e706ef2ae1ff5af78
Author: matt.tieman <matt.tie...@inin.com>
Date:   2015-03-03T16:48:57Z

    STORM-697: Test for MessageMetadataSchemeAsMultiScheme and generateTuples 
with metadata using SchemeAsMultiScheme

commit 2f119c6e2edace030afeb9ee0885010f1de7fc28
Author: matt.tieman <matt.tie...@inin.com>
Date:   2015-03-03T16:50:04Z

    STORM-697: Added scheme to include Partition and offset when generating 
tuple.
    
    The MessageMetadataScheme interface extends Scheme and defines a 
deserialization method that accepts the message byte[], Partition, and the 
offset.
    MessageMetadataSchemeAsMultiScheme follows the same pattern as 
KeyValueSchemeAsMultiScheme, extending SchemeAsMultiScheme and providing a 
deserialization method named for the method defined by
    MessageMetadataScheme.
    StringMessageAndMetadataScheme provides an implementation of 
MessageMetadataScheme, following the same pattern as StringKeyValueScheme.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
