Hey James,

Have you looked at LinkedIn's collection of UDFs, DataFu (http://engineering.linkedin.com/open-source/introducing-datafu-open-source-collection-useful-apache-pig-udfs)?
In particular, they have a UDF called BagSplit (https://github.com/linkedin/datafu/blob/master/src/java/datafu/pig/bags/BagSplit.java). It might not do exactly what you want, since it splits a bag into bags of size n rather than into 10 equal-sized bags, but it shouldn't be too hard to write your own UDF using BagSplit.java as a reference.

Dan F.

On Wed, Apr 11, 2012 at 8:53 AM, James Newhaven <[email protected]> wrote:
> Hi,
>
> I need to divide a large bag into 10 smaller bags of equal size. Does
> anyone know of a function that can do this easily? I've had a look at the
> standard functions and the PiggyBank and can't find anything appropriate.
>
> Thanks,
> James
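
For reference, here is a rough sketch of the partitioning logic a custom UDF along these lines might use. It is not taken from DataFu: plain Java lists stand in for Pig's bags and tuples, and the class name EqualBagSplit and its split method are made up for illustration. It assumes the number of items is known up front, and when the count does not divide evenly, the first few groups get one extra item.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pig or DataFu.
public class EqualBagSplit {

    // Splits items into exactly k groups whose sizes differ by at most one.
    public static <T> List<List<T>> split(List<T> items, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        int base = items.size() / k;   // minimum size of every group
        int extra = items.size() % k;  // this many groups get one more item
        List<List<T>> groups = new ArrayList<>(k);
        int start = 0;
        for (int i = 0; i < k; i++) {
            int size = base + (i < extra ? 1 : 0);
            // Copy the slice so each group is independent of the input list.
            groups.add(new ArrayList<>(items.subList(start, start + size)));
            start += size;
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            items.add(i);
        }
        // 25 items into 10 groups: five groups of 3, then five groups of 2.
        for (List<Integer> group : split(items, 10)) {
            System.out.println(group);
        }
    }
}
```

In an actual UDF the same arithmetic would presumably sit inside the exec method of a class extending EvalFunc, reading tuples from the input bag and writing them to output bags instead of lists.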
