You would have to write a UDF that takes the bag and calculates what you want. 
I'd use the Accumulator interface. It's a bit annoying to have to learn at first, 
but worth it, as it will turn Pig from useful to very powerful.

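Your implementation would be quite similar to the accumulator implementation of 
MAX, except that the update conditions would be trickier: besides the best run 
so far, you also have to carry the previous price and the length of the current 
run from one call to the next.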
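
To give an idea of the shape, here is a rough sketch in plain Java. It is not a 
real Pig UDF: Pig's org.apache.pig.Accumulator has the methods accumulate(Tuple), 
getValue() and cleanup(), and this sketch only mirrors those three with plain 
doubles so it compiles without Pig on the classpath. The class name is made up, 
and it assumes the prices arrive already sorted by date and that a "run" counts 
the days in a strictly rising stretch (which gives 3 for the sample data).

```java
// Hypothetical sketch: mirrors the accumulate/getValue/cleanup shape of a Pig
// Accumulator, but uses plain doubles instead of Pig Tuples and bags.
class UpStreakAccumulator {
    private double prev;          // last price seen
    private boolean seen = false; // whether any price has been seen yet
    private int run = 0;          // length of the current rising run, in days
    private int best = 0;         // longest rising run seen so far

    // Called once per chunk of the (date-sorted) bag; state survives between calls.
    void accumulate(double[] batch) {
        for (double price : batch) {
            // Extend the run only on a strict increase; otherwise start a new run.
            run = (seen && price > prev) ? run + 1 : 1;
            best = Math.max(best, run);
            prev = price;
            seen = true;
        }
    }

    // Called after the last chunk for a key.
    int getValue() {
        return best;
    }

    // Called before the next key so that state does not leak between groups.
    void cleanup() {
        seen = false;
        run = 0;
        best = 0;
    }
}
```

The part that differs from MAX is that a run can straddle two accumulate() calls, 
which is why prev and run are fields rather than locals.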

One nice thing about the UDF is that it could very easily handle a file that is 
"stock    date    price": group by stock and apply it to each group's bag.
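
For the multi-stock case, the per-group idea looks roughly like this in plain 
Java. Again this is only a sketch and not Pig code: the Quote class and the 
method name are invented for illustration, the rows are assumed to be sorted by 
date, and a run is counted in days of a strictly rising stretch. In Pig itself 
the grouping would be done by GROUP, with the UDF seeing one stock's bag at a time.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: keeps one running state per stock symbol.
class PerStockUpStreak {
    // One input row; the field names are assumptions, not a Pig schema.
    static final class Quote {
        final String symbol;
        final String date;
        final double price;

        Quote(String symbol, String date, double price) {
            this.symbol = symbol;
            this.date = date;
            this.price = price;
        }
    }

    // Running state for a single symbol.
    private static final class State {
        double prev;
        int run;
        int best;
    }

    // Returns the longest strictly rising run (in days) for each symbol.
    static Map<String, Integer> longestUpRuns(Iterable<Quote> quotesSortedByDate) {
        Map<String, State> states = new HashMap<>();
        for (Quote q : quotesSortedByDate) {
            State s = states.get(q.symbol);
            if (s == null) {
                // First row for this symbol starts a run of length 1.
                s = new State();
                s.run = 1;
                s.best = 1;
                states.put(q.symbol, s);
            } else {
                s.run = (q.price > s.prev) ? s.run + 1 : 1;
                s.best = Math.max(s.best, s.run);
            }
            s.prev = q.price;
        }
        Map<String, Integer> result = new HashMap<>();
        for (Map.Entry<String, State> e : states.entrySet()) {
            result.put(e.getKey(), e.getValue().best);
        }
        return result;
    }
}
```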


-----Original Message-----
From: Todd Lee <[email protected]>
Date: Sat, 15 Jan 2011 01:52:56 
To: <[email protected]>
Reply-To: [email protected]
Subject: Loop through records row by row?

Hi,

Newbie here. So let's say I have a file which contains the closing market
price of a stock in 2010, e.g.

<Date>, <Price>
===================
2010-1-1, 10.1
2010-1-2, 10.2
2010-1-3, 9.9
2010-1-4, 10.0
2010-1-7, 11.0
...

and all I want is to find out the max number of consecutive days in which
the stock has been in an UP trend (for the above example, the result should
be 3). It is fairly simple to solve in other programming languages using a
for-loop and a couple of temp variables, but is this possible to do in Pig?
The dataset is pretty big.

None of the examples and tutorials I found online had this kind of data
relationship between rows so I really could use your help.

Thanks a lot,
T
