Hi,
I have the following scenario in my project:
1. I continuously receive a stream of data from a source.
2. I need to calculate the mean and variance for each key every minute.
3. Once a minute is over, the computation should start afresh for the new
minute.
Example:
10:00:00 computation and output
10:00:00 key = 1, mean = 10, variance = 2
10:00:00 key = N, mean = 10, variance = 2
10:00:01 computation and output
10:00:00 key = 1, mean = 11, variance = 2
10:00:00 key = N, mean = 12, variance = 2
The 10:00:01 data has no dependency on the 10:00:00 data.
How can I set up such jobs in a single Java Spark Streaming application?
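To make the requirement concrete, here is a minimal plain-Java sketch (no Spark involved) of the per-key computation I have in mind for one minute. The class and method names (MinuteStats, Stats, add, closeMinute) are hypothetical and only for illustration, and I am assuming population variance (dividing by n) computed with Welford's running update. My guess is that in Spark Streaming the reset would correspond to a 1-minute batch interval or a non-overlapping 1-minute window, but that is an assumption on my part.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: per-key mean/variance for one minute,
// with all state discarded when the minute is closed.
class MinuteStats {

    /** Running count, mean and sum of squared deviations (Welford's method). */
    public static final class Stats {
        private long count;
        private double mean;
        private double m2; // sum of squared deviations from the running mean

        void add(double x) {
            count++;
            double delta = x - mean;
            mean += delta / count;
            m2 += delta * (x - mean);
        }

        public long count() { return count; }
        public double mean() { return mean; }

        /** Population variance (divides by n); an assumption, not a requirement. */
        public double variance() { return count > 0 ? m2 / count : 0.0; }
    }

    // State for the minute currently being accumulated.
    private final Map<String, Stats> current = new HashMap<>();

    /** Adds one observation for the given key to the current minute. */
    public void add(String key, double value) {
        current.computeIfAbsent(key, k -> new Stats()).add(value);
    }

    /** Returns the results for the minute that just ended and starts afresh. */
    public Map<String, Stats> closeMinute() {
        Map<String, Stats> finished = new HashMap<>(current);
        current.clear(); // the next minute has no dependency on this one
        return finished;
    }

    public static void main(String[] args) {
        MinuteStats window = new MinuteStats();
        window.add("1", 8.0);
        window.add("1", 12.0);
        window.add("N", 10.0);
        // Output for the first minute, one line per key.
        window.closeMinute().forEach((key, s) ->
            System.out.println("key = " + key + ", mean = " + s.mean()
                + ", variance = " + s.variance()));
    }
}
```

After closeMinute() returns, the next call to add() starts from empty state, which is the "restart fresh" behaviour described in point 3.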
Regards,
Santosh Akhilesh