Hi All,

I am working on adding a de-duplication operator to the Malhar library, based on the managed state APIs. I will be working off the existing JIRA, https://issues.apache.org/jira/browse/APEXMALHAR-1701, and the initial pull request for an AbstractDeduper: https://github.com/apache/apex-malhar/pull/260/files

I am planning to include the following features in the first version:

1. Time-based de-duplication. Assumption: the Tuple_Key -> Tuple_Time correlation holds.
2. Option to maintain the order of incoming tuples.
3. Duplicate and Expired ports to emit duplicate and expired tuples respectively.

Thanks.
~ Bhupesh
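
To make features 1 and 3 concrete, here is a rough sketch of the time-based check in plain Java. It is only an illustration and not the proposed operator: it uses an in-memory map instead of managed state, it does not address ordering (feature 2), and every name in it (TimeBasedDeduperSketch, Verdict, expiryMillis, process) is hypothetical. It assumes timestamps are non-negative milliseconds and that a tuple counts as expired once it is older than the latest time seen by more than the expiry period.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of Malhar or the managed state APIs.
class TimeBasedDeduperSketch {
    // The three outcomes correspond to the unique, duplicate and expired outputs.
    enum Verdict { UNIQUE, DUPLICATE, EXPIRED }

    private final long expiryMillis;                       // assumed expiry period
    private long latestTime = 0;                           // latest tuple time seen so far
    private final Map<String, Long> seen = new HashMap<>(); // key -> tuple time

    TimeBasedDeduperSketch(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    Verdict process(String key, long time) {
        if (time > latestTime) {
            latestTime = time;
            // Relies on the Tuple_Key -> Tuple_Time assumption: a key whose time
            // has expired can be forgotten, because any later copy of that key
            // carries the same time and is caught by the expiry check below.
            seen.values().removeIf(t -> latestTime - t > expiryMillis);
        }
        if (latestTime - time > expiryMillis) {
            return Verdict.EXPIRED;
        }
        // putIfAbsent returns the previous value, so non-null means seen before.
        if (seen.putIfAbsent(key, time) != null) {
            return Verdict.DUPLICATE;
        }
        return Verdict.UNIQUE;
    }
}
```

In an operator, each verdict would be routed to the matching output port; the port wiring is left out here because it depends on the final operator design.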
