I am using Pig and this is what I am trying to do:

1) Sort a relation A into B by a field x, smallest value of x first. This part is straightforward with ORDER ... BY.

2) Label each tuple in B with a number denoting its position in the sorted relation. The first tuple would be labeled 1, the second 2, the third 3, and so on. I am not certain how to do this.

3) Derive a relation C where each row is a bag of tuples. The first row contains the first n1 tuples from relation B, the second row contains the tuples from B labeled (n1 + 1) to n2, the third row contains the tuples from B labeled (n2 + 1) to n3, and so on up to n100. This step is simple (just use FILTER) once each tuple in B has been labeled with a number.

The question: how do I do step 2)? Thanks.
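
To make the intended result concrete, here is a minimal sketch of the three steps in plain Python rather than Pig. It only illustrates the output I am after and is not a Pig solution. The function name `sort_label_bucket`, the `key_index` parameter, and the `boundaries` list (standing in for n1, n2, ..., n100) are hypothetical names chosen for this illustration.

```python
from typing import List, Sequence, Tuple

Row = Tuple  # a tuple from relation A, with fields addressed by position
Labeled = Tuple[int, Row]  # (label, original tuple)


def sort_label_bucket(
    rows: Sequence[Row], key_index: int, boundaries: Sequence[int]
) -> List[List[Labeled]]:
    """Sort rows by one field, label them 1..N, and group them into bags.

    boundaries is an increasing list [n1, n2, ...]; bag k holds the tuples
    labeled from (previous boundary + 1) up to and including boundaries[k].
    """
    # Step 1: sort by field x, smallest value first.
    ordered = sorted(rows, key=lambda r: r[key_index])

    # Step 2: label each tuple with its 1-based position in the sorted order.
    labeled = [(label, row) for label, row in enumerate(ordered, start=1)]

    # Step 3: one bag per boundary, selected by filtering on the label.
    bags = []
    lower = 1
    for upper in boundaries:
        bags.append([t for t in labeled if lower <= t[0] <= upper])
        lower = upper + 1
    return bags


# Example input: x is field 0; boundaries n1 = 2, n2 = 5.
example_rows = [(30, "c"), (10, "a"), (20, "b"), (50, "e"), (40, "d")]
example_bags = sort_label_bucket(example_rows, key_index=0, boundaries=[2, 5])
```

With this input, the first bag holds the tuples labeled 1 and 2 and the second bag holds those labeled 3 to 5. Step 2 is the `enumerate` line, which is the part I do not know how to express in Pig.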