Struggling time by data

2015-12-25 Thread Yasemin Kaya
Hi,

I have been struggling with this data for a couple of days and can't find a
solution. Could you help me?

*DATA:*
*(userid1_time1, url1)*
*(userid1_time2, url2)*


I want to get the URLs that were visited within 30 minutes of each other.

*RESULT:*
*If time2 - time1 < 30 min:*
*(user1, [url1, url2] )*

Best,
yasemin
-- 
hiç ender hiç


Re: Struggling time by data

2015-12-25 Thread Xingchi Wang
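// assumes rdd is an RDD[(String, String)] and the time part of the key is epoch milliseconds
rdd.map { case (x, y) => val s = x.split("_"); (s(0), (s(1).toLong, y)) }
  .groupByKey()
  .filter { case (_, visits) => val ts = visits.map(_._1); ts.max - ts.min < 30 * 60 * 1000L }

Does it work for you?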


Re: Struggling time by data

2015-12-25 Thread Yasemin Kaya
That works, but what I actually want is to group the URLs into sessions.

*DATA:* (sorted by time)
*(userid1_time1, url1)*
*(userid1_time2, url2)*
*(userid1_time3, url3)*
*(userid1_time4, url4)*

*RESULT:*
*url1* is already in *session1*
*time2 - time1 < 30 min*, so *url2* goes to *session1*
*time3 - time2 > 30 min*, so *url3* goes to *session2*
*time4 - time3 < 30 min*, so *url4* goes to *session2*

*(user1, [url1, url2], [url3, url4])*

Does your solution fit my problem?
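
The session grouping described above could be sketched roughly as follows. This is only an illustration in plain Scala collections, not something posted in the thread. It assumes the time part of the key is epoch milliseconds (the thread does not say what format it uses), that the user id is everything before the last underscore, and that a gap of exactly 30 minutes starts a new session. The names Sessionize, parseKey, sessions and sessionizeAll are made up for this example.

```scala
object Sessionize {
  // Assumption: timestamps are epoch milliseconds.
  val GapMillis: Long = 30L * 60 * 1000

  // Split a key such as "user1_600000" into (userId, timestamp),
  // cutting at the last underscore.
  def parseKey(key: String): (String, Long) = {
    val idx = key.lastIndexOf('_')
    (key.substring(0, idx), key.substring(idx + 1).toLong)
  }

  // Group one user's (timestamp, url) visits into sessions. A new session
  // starts when the gap to the previous visit is at least GapMillis.
  def sessions(visits: Seq[(Long, String)]): List[List[String]] = {
    val sorted = visits.sortBy(_._1)
    // Accumulator: (finished sessions, current session, previous timestamp).
    // Both lists are built in reverse and flipped at the end.
    val (finished, current, _) =
      sorted.foldLeft((List.empty[List[String]], List.empty[String], 0L)) {
        case ((fin, cur, prev), (ts, url)) =>
          if (cur.nonEmpty && ts - prev >= GapMillis) (cur.reverse :: fin, List(url), ts)
          else (fin, url :: cur, ts)
      }
    (if (current.isEmpty) finished else current.reverse :: finished).reverse
  }

  // Parse every (userid_time, url) record and sessionize per user.
  def sessionizeAll(records: Seq[(String, String)]): Map[String, List[List[String]]] =
    records
      .map { case (key, url) => val (user, ts) = parseKey(key); (user, (ts, url)) }
      .groupBy(_._1)
      .map { case (user, rows) => user -> sessions(rows.map(_._2)) }
}
```

On an RDD, the same sessions function could probably be applied per user after the map and groupByKey steps from the earlier reply, for example with mapValues(v => Sessionize.sessions(v.toSeq)). That assumes each user's visits fit in memory on one executor.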
