GitHub user kl0u commented on the issue:

    https://github.com/apache/flink/pull/4172
  
    @dianfu, I am also including @wuchong on this, as these two are related. 
    
    The way I see it, by not serializing the conditions and the states, you 
are trying to gain some speed, especially when using RocksDB, where you 
serialize/deserialize on every element, right? My suggestion is not to do these 
optimizations yet.
    
    First of all, this seems like a premature optimization to me, as we are 
not sure yet about the interplay between all the features we are planning to 
put in `CEP`, and we know that if we allow users to add `Patterns` at runtime, 
then we will need 1) to store both States and Conditions and 2) to match the 
States and Conditions of a given NFA with its SharedBuffer. In other words, we 
will need a unique ID for each NFA that will match the restored SharedBuffer 
(which is still serialized and deserialized as before) with the States 
(`metastates` in this PR) and Conditions (`ConditionRegistry` in 
https://github.com/apache/flink/pull/4145) of the NFA. 
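    
    To make the ID-based matching concrete, here is a rough sketch in Java of 
what I have in mind. It is only an illustration under simplifying assumptions: 
all names (`NfaRestoreSketch`, `StaticNfa`, `RestoredNfa`, `register`, 
`snapshot`, `restore`) are hypothetical and exist neither in Flink nor in 
either PR, and events are plain strings. Only the NFA ID and the buffered 
events are serialized; states and conditions are looked up by ID on restore.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: none of these types are part of Flink CEP.
class NfaRestoreSketch {

    /** Static part of one NFA: the condition attached to each state name. */
    static final class StaticNfa {
        final Map<String, Predicate<String>> conditionsByState;

        StaticNfa(Map<String, Predicate<String>> conditionsByState) {
            this.conditionsByState = conditionsByState;
        }
    }

    /** Dynamic part read back from a snapshot, re-attached to its static part. */
    static final class RestoredNfa {
        final String nfaId;
        final StaticNfa staticPart;
        final List<String> bufferedEvents;

        RestoredNfa(String nfaId, StaticNfa staticPart, List<String> bufferedEvents) {
            this.nfaId = nfaId;
            this.staticPart = staticPart;
            this.bufferedEvents = bufferedEvents;
        }
    }

    // Static state kept outside the per-element snapshot, keyed by NFA ID.
    private final Map<String, StaticNfa> registry = new HashMap<>();

    void register(String nfaId, StaticNfa nfa) {
        registry.put(nfaId, nfa);
    }

    /** Serializes only the NFA ID and the buffered events (the dynamic state). */
    byte[] snapshot(String nfaId, List<String> bufferedEvents) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(nfaId);
            out.writeInt(bufferedEvents.size());
            for (String event : bufferedEvents) {
                out.writeUTF(event);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the dynamic state and uses the ID to find the matching static state. */
    RestoredNfa restore(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String nfaId = in.readUTF();
            int size = in.readInt();
            List<String> events = new ArrayList<>(size);
            for (int i = 0; i < size; i++) {
                events.add(in.readUTF());
            }
            StaticNfa staticPart = registry.get(nfaId);
            if (staticPart == null) {
                throw new IllegalStateException("No static NFA registered for ID " + nfaId);
            }
            return new RestoredNfa(nfaId, staticPart, events);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        NfaRestoreSketch sketch = new NfaRestoreSketch();
        Map<String, Predicate<String>> conditions = new HashMap<>();
        conditions.put("start", event -> event.startsWith("a"));
        sketch.register("nfa-1", new StaticNfa(conditions));

        byte[] bytes = sketch.snapshot("nfa-1", Arrays.asList("a1", "a2"));
        RestoredNfa restored = sketch.restore(bytes);
        System.out.println(restored.nfaId + " " + restored.bufferedEvents);
    }
}
```

    The point of the sketch is only that the ID is the single link between 
the two halves: if patterns can be added at runtime, a restored buffer whose 
ID has no registered static part cannot be used, which is why the lookup fails 
loudly instead of returning a partial NFA.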
    
    So I propose to implement 
https://issues.apache.org/jira/browse/FLINK-7008 and 
https://issues.apache.org/jira/browse/FLINK-6938 right away so that we can 
proceed with the SQL integration, and to think about a more general solution 
for checkpointing the static state of the NFA (states and conditions) 
separately from the dynamic one (SharedBuffer), which will lead to runtime gains.

