Hi,
 
I have some questions regarding Spark Streaming.
 
I receive a stream of JSON messages from Kafka.
The messages consist of a timestamp and an ID.
 
timestamp           ID
2016-12-06 13:00    1
2016-12-06 13:40    5
...
 
In a database I have values for each ID:
 
ID    minute    value
1     0         3
1     1         5
1     2         7
1     3         8
5     0         6
5     1         6
5     2         8
5     3         5
5     4         6
 
I would like to join each incoming JSON message with the corresponding database values on the ID. The result should look as follows:
 
timestamp           ID    minute    value
2016-12-06 13:00    1     0         3
2016-12-06 13:00    1     1         5
2016-12-06 13:00    1     2         7
2016-12-06 13:00    1     3         8
2016-12-06 13:40    5     0         6
2016-12-06 13:40    5     1         6
2016-12-06 13:40    5     2         8
2016-12-06 13:40    5     3         5
2016-12-06 13:40    5     4         6
...
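
To make the intended join concrete, this is roughly what I have in mind, written in plain Python rather than Spark. It is only a sketch on the sample data above; the names `messages`, `values_by_id` and `join_messages`, and the JSON field names, are made up for illustration.

```python
from datetime import datetime

# Sample messages as they might look after parsing the JSON
# (the field names "timestamp" and "id" are assumptions).
messages = [
    {"timestamp": datetime(2016, 12, 6, 13, 0), "id": 1},
    {"timestamp": datetime(2016, 12, 6, 13, 40), "id": 5},
]

# Static values from the database, keyed by ID.
# Each entry is a list of (minute, value) pairs.
values_by_id = {
    1: [(0, 3), (1, 5), (2, 7), (3, 8)],
    5: [(0, 6), (1, 6), (2, 8), (3, 5), (4, 6)],
}


def join_messages(messages, values_by_id):
    """Inner join on ID: one output row per (message, database row) pair."""
    joined = []
    for msg in messages:
        # Messages whose ID has no database rows produce no output.
        for minute, value in values_by_id.get(msg["id"], []):
            joined.append((msg["timestamp"], msg["id"], minute, value))
    return joined


for ts, id_, minute, value in join_messages(messages, values_by_id):
    print(ts.strftime("%Y-%m-%d %H:%M"), id_, minute, value)
```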
 
Then I would like to add the minute value, as an offset in minutes, to the timestamp. I only need the computed timestamp and the value, so the result should look as follows:
 
timestamp           value
2016-12-06 13:00    3
2016-12-06 13:01    5
2016-12-06 13:02    7
2016-12-06 13:03    8
2016-12-06 13:40    6
2016-12-06 13:41    6
2016-12-06 13:42    8
2016-12-06 13:43    5
2016-12-06 13:44    6
...
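
In plain Python (again, not Spark) the timestamp computation I mean would be something like the following sketch. The function name `shift_rows` and the tuple layout `(timestamp, id, minute, value)` are my own assumptions for illustration.

```python
from datetime import datetime, timedelta


def shift_rows(joined_rows):
    """Add each row's minute offset to its timestamp and keep only
    the computed timestamp and the value."""
    return [
        (ts + timedelta(minutes=minute), value)
        for ts, _id, minute, value in joined_rows
    ]


# A few joined rows in the assumed layout (timestamp, id, minute, value).
joined_rows = [
    (datetime(2016, 12, 6, 13, 0), 1, 0, 3),
    (datetime(2016, 12, 6, 13, 0), 1, 1, 5),
    (datetime(2016, 12, 6, 13, 40), 5, 4, 6),
]

for ts, value in shift_rows(joined_rows):
    print(ts.strftime("%Y-%m-%d %H:%M"), value)
```

What I am unsure about is how to express the same two steps on a stream in Spark.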
 
Is this a feasible use case for Spark Streaming? I thought I could join the streaming data with the static data, but I am not sure how to add the minute values to the timestamp.
 
Thank you in advance.
 
Best regards,
Daniela
 