I guess one way to do this is to use RANK twice: once on the original relation, and once on the original relation minus its first point. Rank i in the second relation is then point i+1 of the first, so you can join the two on the rank and subtract.
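To make the idea concrete, here is a minimal sketch in plain Python (not Pig) that does the same rank-and-join on the first five timestamps from your sample. The function names `rank` and `consecutive_differences` are made up for illustration, and everything is held in memory, so this only shows the logic, not how it would scale.

```python
# Illustrative Python sketch of the rank-and-join idea (not Pig).
# The sample values are the first five timestamps from the question.
timestamps = [20141014120523, 20141014120534, 20141014120537,
              20141014120542, 20141014120549]


def rank(values):
    """Map 1-based rank -> value, ranking by ascending value."""
    return {i: v for i, v in enumerate(sorted(values), start=1)}


def consecutive_differences(values):
    """Difference between each point and the one before it, via two rankings."""
    ranked_all = rank(values)                             # rank over every point
    first = min(values)
    ranked_rest = rank(v for v in values if v > first)    # rank without the first point
    # Inner join on the rank: rank i in ranked_rest is point i+1 of the original.
    shared_ranks = sorted(ranked_all.keys() & ranked_rest.keys())
    return [ranked_rest[i] - ranked_all[i] for i in shared_ranks]


differences = consecutive_differences(timestamps)
print(differences)
```

This should print [11, 3, 5, 7]. Note that subtracting the raw numbers only equals seconds while both timestamps fall within the same minute; across a minute boundary you would need to convert them to real time values first.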
In Pig, something like:

A = load 'data' as (timestamp:long);
B = filter A by timestamp > 20141014120523L; -- remove the first point
C = RANK A by timestamp;
D = RANK B by timestamp;
E = JOIN C by $0, D by $0; -- join on the rank
F = foreach E generate D::timestamp - C::timestamp; -- later point minus earlier point

Disclaimer: the script is just off the top of my head and is not tested.

Cheers,

--
Gianmarco

On 8 October 2014 09:01, Krishna Kalyan <[email protected]> wrote:

> Hi Everybody,
>
> Input file: records are sorted based on the timestamp.
> Expected input file size will be: 2-3 TB
>
> timestamp
> ==============
> 20141014120523
> 20141014120534
> 20141014120537
> 20141014120542
> 20141014120549
> 20141014120555
> 20141014120565
> 20141014120570
> 20141014120512
> ...
> ...
>
>
> Using Pig I need to find the time difference between the Nth record's and
> the (N-1)th record's timestamps (20141014120534 - 20141014120523 = 11 secs).
> I need to loop through all the records to get the time difference from
> the previous record.
>
> Example output
> 0
> 11
> 3
> 5
> ...
>
> Please guide.
>
> Regards,
> Krishna Kalyan
>
