Hi,
I have a small question about custom partitioners. I couldn't find a method in the Java API that lets me partition a dataset in a total order. Did I just overlook it, or is it not supported?

The use case is writing out data so that it can be consumed by another tool, and also for quick human inspection. I can easily apply the sorting within each hash partition, but I would lose the total order, i.e. the guarantee that all keys in partition one come before (or after) all keys in partition two, and so on.

Pig supports this by mapping the ORDER BY statement to two jobs. The first one samples the data to build a sketch that approximates the key distribution and then feeds the result into the total order partitioner.

Thanks,
Johannes
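
P.S. To make concrete what I mean by a total order partitioner, here is a minimal sketch in plain Java. It is only an illustration of the sample-then-range-partition idea described above, not the API I am asking about and not how Pig actually implements it. The class name, method names, sample size and seeds are all made up for this example, and it assumes integer keys that fit in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only: none of these names come from a real framework.
class TotalOrderSketch {

    // Step 1: draw a random sample of the keys (with replacement) and take
    // numPartitions - 1 evenly spaced values from the sorted sample as split points.
    public static int[] computeSplitPoints(List<Integer> keys, int sampleSize,
                                           int numPartitions, long seed) {
        Random rnd = new Random(seed);
        int[] sample = new int[sampleSize];
        for (int i = 0; i < sampleSize; i++) {
            sample[i] = keys.get(rnd.nextInt(keys.size()));
        }
        Arrays.sort(sample);
        int[] splits = new int[numPartitions - 1];
        for (int i = 1; i < numPartitions; i++) {
            splits[i - 1] = sample[i * sampleSize / numPartitions];
        }
        return splits;
    }

    // Step 2: range partitioner. A key goes to the first partition whose split
    // point is >= key, so the partition index never decreases as the key grows.
    public static int partitionFor(int key, int[] splits) {
        int p = 0;
        while (p < splits.length && key > splits[p]) {
            p++;
        }
        return p;
    }

    // Step 3: route every key to its partition, then sort each partition locally.
    public static List<List<Integer>> partitionAndSort(List<Integer> keys, int[] splits) {
        List<List<Integer>> partitions = new ArrayList<>();
        for (int i = 0; i <= splits.length; i++) {
            partitions.add(new ArrayList<>());
        }
        for (int key : keys) {
            partitions.get(partitionFor(key, splits)).add(key);
        }
        for (List<Integer> partition : partitions) {
            Collections.sort(partition);
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Made-up input: 1000 random keys, 4 partitions, sample of 100 keys.
        Random rnd = new Random(42);
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            keys.add(rnd.nextInt(10000));
        }
        int[] splits = computeSplitPoints(keys, 100, 4, 7L);
        List<List<Integer>> partitions = partitionAndSort(keys, splits);
        for (int i = 0; i < partitions.size(); i++) {
            System.out.println("partition " + i + ": " + partitions.get(i).size() + " keys");
        }
    }
}
```

Reading the partitions back in index order gives the globally sorted data, which is the property I lose with hash partitioning. The sample only affects how evenly the keys are spread across partitions, not the correctness of the order.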