Hi

I have a dataset in which almost 99% of the data is correct. Right now, if
a record is bad, I log it, skip it, and return only the correct data.
I do this inside a map function.

The part that decides whether a record is correct is expensive.

Now I want to store the bad data somewhere so that I can analyze it
later.

I could run the same calculation twice, collecting the correct data in the
first pass and the bad data in the second, but that would mean paying for
the expensive check twice.

Is there a better way to store the bad data from inside the map function,
for example by sending it to Kafka or writing it to a file?
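
To make the question concrete, here is a minimal sketch in plain Python (not
tied to any framework) of the single-pass behavior I'm after: the expensive
check runs once per record, and each record lands in either the good or the
bad collection. The names expensive_check and split_once are hypothetical,
made up for illustration, and the real check is far more involved.

```python
from typing import Callable, Iterable, List, Tuple


def expensive_check(record: str) -> bool:
    # Hypothetical stand-in for the costly validation step.
    return record.isdigit()


def split_once(
    records: Iterable[str], is_valid: Callable[[str], bool]
) -> Tuple[List[str], List[str]]:
    # Call the check exactly once per record and route the record
    # to the matching list based on the result.
    good: List[str] = []
    bad: List[str] = []
    for record in records:
        if is_valid(record):
            good.append(record)
        else:
            bad.append(record)
    return good, bad


good, bad = split_once(["1", "x", "22"], expensive_check)
```

In the real job the bad records would go to Kafka or a file instead of
staying in an in-memory list.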

Also, is there a way to create a datastream that receives data from inside
the map function? (I'm not sure this is feasible as of now.)

Thanks
