This is somewhat of a continuation of my thread "Writing Out List<String>."

Right now, the only way to do sorting is with the Top class. This works
well, but it requires the sorted output to fit in memory.

A common batch use case is to take a large file and sort it, for example
sorting a large (several GB) report file by timestamp. As of right now,
this isn't built into Beam. I think it should be.
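
To make the use case concrete, here is a rough sketch of the classic
single-machine external merge sort in plain Java. It is not Beam code and
not a proposed API. The names ExternalSortSketch and BY_TIMESTAMP are made
up for illustration, and I'm assuming each line starts with an ISO-8601
timestamp followed by a comma, so comparing that prefix as a string orders
lines by time.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class ExternalSortSketch {

  // Assumption: every line looks like "<ISO-8601 timestamp>,<rest>".
  public static final Comparator<String> BY_TIMESTAMP =
      Comparator.comparing(line -> line.substring(0, line.indexOf(',')));

  /** One sorted run on disk plus the line currently at its head. */
  private static final class Run {
    final BufferedReader reader;
    String head;

    Run(Path path) throws IOException {
      reader = Files.newBufferedReader(path);
      head = reader.readLine();
    }
  }

  /** Sorts the lines of input into output using bounded-size chunks. */
  public static void sort(Path input, Path output, int maxLinesInMemory,
      Comparator<String> order) throws IOException {
    // Phase 1: read bounded chunks, sort each in memory, spill to a temp file.
    List<Path> runs = new ArrayList<>();
    try (BufferedReader in = Files.newBufferedReader(input)) {
      List<String> chunk = new ArrayList<>();
      String line;
      while ((line = in.readLine()) != null) {
        chunk.add(line);
        if (chunk.size() >= maxLinesInMemory) {
          runs.add(writeRun(chunk, order));
          chunk.clear();
        }
      }
      if (!chunk.isEmpty()) {
        runs.add(writeRun(chunk, order));
      }
    }

    // Phase 2: k-way merge, keeping only one line per run in memory.
    PriorityQueue<Run> heap =
        new PriorityQueue<>((a, b) -> order.compare(a.head, b.head));
    for (Path path : runs) {
      heap.add(new Run(path)); // runs are never empty, so head is non-null
    }
    try (BufferedWriter out = Files.newBufferedWriter(output)) {
      while (!heap.isEmpty()) {
        Run run = heap.poll();
        out.write(run.head);
        out.newLine();
        run.head = run.reader.readLine();
        if (run.head != null) {
          heap.add(run); // re-insert with its new head line
        } else {
          run.reader.close(); // this run is exhausted
        }
      }
    }
    for (Path path : runs) {
      Files.deleteIfExists(path);
    }
  }

  /** Sorts one chunk in memory and writes it to a new temp file. */
  private static Path writeRun(List<String> chunk, Comparator<String> order)
      throws IOException {
    chunk.sort(order);
    Path run = Files.createTempFile("sort-run-", ".txt");
    Files.write(run, chunk);
    return run;
  }
}
```

Calling ExternalSortSketch.sort(in, out, 1_000_000, ExternalSortSketch.BY_TIMESTAMP)
would sort a report this way on one machine. What I'm asking for is the
distributed equivalent as a Beam transform.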

I'll hold up Crunch's Sort
<https://crunch.apache.org/apidocs/0.11.0/org/apache/crunch/lib/Sort.html>
class as an example of what this class could look like.

Thanks,

Jesse
