Hi,

My current application uses a DynamoDB table to map a key to a value. As each record enters the system, an async I/O operator calls this DB and requests the value for the key; if that value doesn't exist, a new value is generated and inserted. I have managed to do all of this in a single update operation against DynamoDB, so performance isn't too bad. This is usable for our current load, but our load will increase considerably in the near future, and because writes are expensive (each update is classed as a write, even if it only returns the existing value) this could become a cost factor going forward.
Broadcast state looks like it might be the answer. DynamoDB allows 'streams' of table modification events to be output to what is essentially a Kinesis stream, so it might be possible to avoid the majority of write calls by keeping local copies of the mappings. I should also point out that these mappings are essentially capped: the majority of events that come through will have an existing mapping.

My idea is to try the following:

1. On application startup, request the entire dataset from the DB (this is ~5m key:value pairs).
2. Inject this data into Flink state somehow, possibly via broadcast state?
3. Subscribe to the DynamoDB stream via broadcast state to capture updates to this table and update the Flink state.
4. When a record is processed, check Flink state for an existing mapping and proceed if one is found. If not, run the AsyncIO process as before to generate a new mapping.
5. DynamoDB writes the new value to the stream, so all operators get the new value via broadcast state.

Is this idea workable? I am unsure about the initial DB fetch, and about the AsyncIO process when a new value needs to be inserted.

Any thoughts appreciated.

Thanks
O
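
P.S. To make steps 4 and 5 concrete, here is a minimal plain-Java sketch of the lookup flow I have in mind. It is not Flink or DynamoDB code: `RemoteMappingStore` and `CachedMappingLookup` are hypothetical names, an in-memory map stands in for the DynamoDB table, and another map stands in for the Flink state. The only thing it is meant to show is that a local hit costs no write, while a miss falls through to the existing get-or-create call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

/** Hypothetical stand-in for the DynamoDB table. */
final class RemoteMappingStore {
    private final Map<String, String> table = new ConcurrentHashMap<>();
    private final AtomicLong writes = new AtomicLong();
    private final Function<String, String> generator;

    RemoteMappingStore(Function<String, String> generator) {
        this.generator = generator;
    }

    /** Single get-or-create call; counted as a write even when the key already exists. */
    String getOrCreate(String key) {
        writes.incrementAndGet();
        return table.computeIfAbsent(key, generator);
    }

    long writeCount() {
        return writes.get();
    }
}

/** Hypothetical stand-in for an operator holding a local copy of the mappings. */
final class CachedMappingLookup {
    private final Map<String, String> local = new ConcurrentHashMap<>();
    private final RemoteMappingStore remote;

    CachedMappingLookup(RemoteMappingStore remote) {
        this.remote = remote;
    }

    /** Steps 1-3 and 5: apply a mapping from the initial load or from the change stream. */
    void onMappingUpdate(String key, String value) {
        local.put(key, value);
    }

    /** Step 4: answer from the local copy if possible, otherwise fall back to the remote call. */
    String lookup(String key) {
        String cached = local.get(key);
        if (cached != null) {
            return cached; // local hit: no remote write
        }
        // Local miss: the new mapping is not stored here directly; it is expected
        // to arrive later through onMappingUpdate, as described in step 5.
        return remote.getOrCreate(key);
    }
}
```

With the local copy preloaded, only keys that have no mapping yet (or whose update has not arrived from the stream) should reach `getOrCreate`, which is where I would expect the saving in writes to come from.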