Hi, I was looking at the new full outer join. It seems to work fine for my use case, but I have a question about the state size.
I have two streams, each with hundreds of millions of unique keys. Each key also receives an updated value hundreds of times per day. As I understand it, in a full outer join Flink keeps in state every value it has seen for each key, and whenever a new value arrives on one stream, it is joined against all values stored for that key from the other stream, which could be one row or hundreds. That seems inefficient, but my question is mainly about state: I would need to keep billions of values in state on both sides, which will be very expensive.

This is a non-windowed join. A key can receive updates for 50-60 days; after that it won't get any more updates on either stream.

Is there a way to use state such that only one value per key is retained, to reduce the state size? I am using the Table API but could use the DataStream API if needed.

Thanks
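
P.S. To make the question concrete, here is a minimal sketch of the semantics I have in mind. It is plain Java, not Flink API: the class `LatestValueJoin` and its methods are hypothetical names I made up for illustration. Each side keeps only the latest value per key, and every update is joined against the single latest value from the other side (or null if none has arrived yet). I assume that in the DataStream API this would roughly correspond to a `KeyedCoProcessFunction` holding one `ValueState` per side, possibly with a state TTL to drop keys after the 50-60 day window, but I have not verified that this is the recommended approach.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical plain-Java model of "one value per key per side"; not Flink code.
public class LatestValueJoin {
    // One entry per key and side: a new update overwrites the previous value.
    private final Map<String, String> left = new HashMap<>();
    private final Map<String, String> right = new HashMap<>();

    // Update from stream 1: store it, then join with the latest value
    // from stream 2 (null if stream 2 has not produced this key yet).
    public String onLeft(String key, String value) {
        left.put(key, value);
        return format(key, value, right.get(key));
    }

    // Update from stream 2: symmetric to onLeft.
    public String onRight(String key, String value) {
        right.put(key, value);
        return format(key, left.get(key), value);
    }

    // Total number of values retained across both sides.
    public int stateSize() {
        return left.size() + right.size();
    }

    private static String format(String key, String l, String r) {
        return key + ": (" + l + ", " + r + ")";
    }

    public static void main(String[] args) {
        LatestValueJoin join = new LatestValueJoin();
        System.out.println(join.onLeft("k1", "a1"));  // k1: (a1, null)
        System.out.println(join.onRight("k1", "b1")); // k1: (a1, b1)
        System.out.println(join.onLeft("k1", "a2"));  // k1: (a2, b1)
        System.out.println(join.stateSize());         // 2
    }
}
```

With this behaviour the state holds at most two values per key no matter how many updates arrive, instead of growing with every update.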