Hi, I have a dataset in which almost 99% of the data is correct. Currently, when a record is bad, I log it, drop it, and return only the correct data. I do this inside a map function.
The step that decides whether a record is correct is expensive. I now want to store the bad data somewhere so that I can analyze it later. One option is to run the same calculation twice, collecting the correct data in the first pass and the bad data in the second, but that means paying for the expensive check twice. Is there a better way to store the bad data from inside the map function, for example by sending it to Kafka or writing it to a file? Also, is there a way to create a datastream that receives data from inside the map function? (I'm not sure whether this is feasible.) Thanks.
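
To make the goal concrete, here is a minimal sketch of the desired single-pass behaviour in plain Python. It is not tied to any streaming framework, and `split_once` and `expensive_check` are hypothetical names used only for illustration; the real validation logic and the real sink (Kafka, file, etc.) are not shown.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def split_once(
    records: Iterable[T], is_valid: Callable[[T], bool]
) -> Tuple[List[T], List[T]]:
    """Run the validity check exactly once per record and route the
    record to either the good or the bad output."""
    good: List[T] = []
    bad: List[T] = []
    for record in records:
        if is_valid(record):
            good.append(record)
        else:
            bad.append(record)
    return good, bad


# Hypothetical stand-in for the expensive check; it counts its own calls
# so we can confirm it runs once per record rather than twice.
calls = 0


def expensive_check(value: int) -> bool:
    global calls
    calls += 1
    return value >= 0


good, bad = split_once([3, -1, 7, -5, 0], expensive_check)
```

The point of the sketch is only that each record is validated once and ends up in exactly one of two outputs; the question is how to get the same effect from inside a map function, with the bad output going to an external store or a second stream.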