Hi all, I need to process a high-rate data stream. To reduce the load, I would like to apply sampling to each stream window, i.e. process only a subset of each window's data rather than the whole window. I see that Spark supports sampling operations on RDDs; does it support sampling on DStreams as well? If not, could you suggest how to implement it?
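To make the question concrete, here is a rough sketch of the per-window sampling I have in mind, written in plain Python without Spark. The function name `sample_window`, the 10% fraction, and the toy windows are all made up for illustration. My guess is that on Spark this would correspond to applying the RDD `sample` operation to each batch RDD of the DStream, but that is an assumption on my part.

```python
import random

def sample_window(window, fraction, seed=None):
    """Keep each record of one window independently with probability `fraction`."""
    rng = random.Random(seed)
    return [record for record in window if rng.random() < fraction]

# Hypothetical stream: three windows of 1000 records each.
windows = [list(range(i * 1000, (i + 1) * 1000)) for i in range(3)]

# Sample roughly 10% of each window; only the sample would be processed further.
sampled = [sample_window(w, fraction=0.1, seed=i) for i, w in enumerate(windows)]

for w, s in zip(windows, sampled):
    print(f"window of {len(w)} records -> processing {len(s)}")
```

The sample size varies from window to window because each record is kept independently; a fixed-size sample per window would need a different approach (for example `random.sample`).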
Thanks, Martin