Do you mean every 7 minutes?
E.g., [10:07, 10:14),
[10:14, 10:21).
On 28 January 2013 12:56, Oleg Ruchovets wrote:
> Hi ,
> I have such row data structure:
>
> event_id | time
> ==
> event1 | 10:07
> event2 | 10:10
> event3 | 10:12
>
> event4 | 10:
Quick idea:
Since each of your events will go into several buckets, you could use map() to
emit each item multiple times, once for each bucket.
On 28.01.2013 at 13:56, Oleg Ruchovets wrote:
> Hi ,
>I have such row data structure:
>
> event_id | time
> ==
> event1 | 10:07
>
Hi Kai.
It is very interesting. Can you please explain your idea in more
detail?
What will be the key in the map phase?
Suppose we have an event at 10:07. How would you emit this to the multiple
buckets?
Thanks
Oleg.
On Mon, Jan 28, 2013 at 3:17 PM, Kai Voigt wrote:
> Quick idea:
>
> since each
Hi again,
the idea is that you emit every event multiple times. So your map input record
(event1, 10:07) will be emitted seven times during the map() call. Like I said,
(10:04,event1), (10:05,event1), ..., (10:10,event1) will be the seven outputs
for processing a single event.
The output key w
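
To make the multiple-emit idea concrete, here is a minimal sketch in plain Python (not the Hadoop API). The function name map_event and the constant WINDOW_MINUTES are hypothetical; the window of 3 minutes on either side is an assumption chosen only because it reproduces the seven keys 10:04 through 10:10 mentioned above.

```python
from datetime import datetime, timedelta

# Assumption: 3 minutes on either side of the event time,
# which yields the seven keys 10:04 .. 10:10 for an event at 10:07.
WINDOW_MINUTES = 3


def map_event(event_id, time_str):
    """Yield one (bucket_key, event_id) pair per minute in the window.

    Times are HH:MM strings; wrap-around at midnight is ignored here.
    """
    t = datetime.strptime(time_str, "%H:%M")
    for offset in range(-WINDOW_MINUTES, WINDOW_MINUTES + 1):
        key = (t + timedelta(minutes=offset)).strftime("%H:%M")
        yield key, event_id


pairs = list(map_event("event1", "10:07"))
for key, event_id in pairs:
    print(key, event_id)
```

With this shape, every event that falls inside the window around a given minute arrives under the same key, so they can be grouped together after the map phase.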
Hi Zhiwei,
No :-). Every 7 minutes is easy: just dividing the time in milliseconds
by the bucket length (7 * 60 * 1000) will give you a bucket key.
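
As a rough illustration of the fixed-bucket case, the sketch below computes a bucket key by integer division. The names bucket_key and to_ms are hypothetical, and it assumes times are given as milliseconds since midnight, so buckets are aligned to midnight rather than to 10:07.

```python
BUCKET_MS = 7 * 60 * 1000  # seven minutes, in milliseconds


def to_ms(hours, minutes):
    """Convert a time of day to milliseconds since midnight (assumed input form)."""
    return (hours * 60 + minutes) * 60 * 1000


def bucket_key(timestamp_ms):
    """Return the index of the fixed 7-minute bucket containing the timestamp."""
    return timestamp_ms // BUCKET_MS


keys = {
    "event1": bucket_key(to_ms(10, 7)),
    "event2": bucket_key(to_ms(10, 10)),
    "event3": bucket_key(to_ms(10, 12)),
}
print(keys)
```

Under these assumptions event1 (10:07) and event2 (10:10) land in different buckets even though they are only three minutes apart, which is why fixed buckets alone do not cover the case described next.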
I need to do the following:
Find the events which occurred during time T relative to the event X.
In a very naive approach I need to take the first event and find other events
which
Well, much clearer, but I still have some questions :-)
Suppose we have 3 map input records:
event1 | 10:07
event2 | 10:10
event3 | 10:12
Output from map(event1 | 10:07) will be:
mapOutput(10:04:event1)
mapOutput(10:05:event1)
mapOutput(10:06:event1)
mapOutput(10:07:event1)
mapOutput(10:08