[GitHub] storm pull request #1760: Add topology stream-awareness to storm-redis

mo-getter Thu, 03 Nov 2016 23:49:04 -0700

Github user mo-getter commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1760#discussion_r86495472
  
    --- Diff: 
external/storm-redis/src/main/java/org/apache/storm/redis/bolt/RedisFilterBolt.java
 ---
    @@ -45,18 +48,41 @@
      */
     public class RedisFilterBolt extends AbstractRedisBolt {
         private final RedisFilterMapper filterMapper;
    +    private final StreamMapper streamMapper;
         private final RedisDataTypeDescription.RedisDataType dataType;
         private final String additionalKey;
     
         /**
    -     * Constructor for single Redis environment (JedisPool)
    +     * Constructor for single Redis environment (JedisPool).
    +     * Tuples will be emitted to Storm's default streamId.
          * @param config configuration for initializing JedisPool
          * @param filterMapper mapper containing which datatype, query key 
that Bolt uses
          */
         public RedisFilterBolt(JedisPoolConfig config, RedisFilterMapper 
filterMapper) {
    --- End diff --
    
    Thanks for taking a look at this. I see your point, and I agree it makes 
the most sense for both to be done in the same interface. TL;DR is at the 
bottom ;-)
    
    The same problem also applies to RedisLookupMapper, which also defines 
`declareOutputFields` separately. I just came across 
[STORM-1953](https://issues.apache.org/jira/browse/STORM-1953).
    
    About your first option, would it make more sense for a FilterMapper to be 
a StreamMapper, rather than a StreamMapper be a FilterMapper? I'm afraid that 
making StreamMapper a FilterMapper would introduce too much ambiguity in the 
lookup and filter bolts (if the constructors accepted both), since we'd have to 
rely on only the docs to define which object's `declareOutputFields` would be 
called. It also would make STORM-1953 worse. Or did you mean the bolts only 
accept FilterMapper, and have something like this in `execute`:
    ```
    if (filterMapper instanceof StreamMapper) {
        String streamId = ((StreamMapper) filterMapper).getStreamId(input, 
value);
        collector.emit(streamId, input, value);
    } else {
        collector.emit(input, value);
    }
    ```
    Either way, if you combine them, one downside is that the provided 
convenience StreamMapper implementations would have to be sacrificed. Making 
them abstract probably wouldn't be worth it for something like just specifying 
the stream.
    
    In case you want to see what having FilterMapper and LookupMapper also 
extend StreamMapper looks like, I implemented that in a branch 
[here](https://github.com/apache/storm/compare/master...mo-getter:stream-mapper).
 The flexibility to dynamically choose a stream is there, but the problem is 
that the trident-related classes also use LookupMapper, and have no need to 
declare a streamId, yet users will have to implement this method in their 
LookupMappers. Just returning null is one [not so good] option here, and is 
also an option when using LookupMapper for bolts (in which case, the existing 
behavior of emitting to the default stream is maintained).
    
    **TL;DR:** I can't think of a great solution for what you mentioned, while 
maintaining user-friendliness of the API, without totally redoing the Mapper 
interfaces, i.e. 
[STORM-1953](https://issues.apache.org/jira/browse/STORM-1953). On the other 
hand, the above commit does maintain full backward compatibility and is 
probably most convenient for users!



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1760: Add topology stream-awareness to storm-redis

Reply via email to