On Mon, 2009-02-23 at 11:14 +0000, Steve Loughran wrote:
> Dumbo provides py support under Hadoop:
> http://wiki.github.com/klbostee/dumbo
> https://issues.apache.org/jira/browse/HADOOP-4304

Ooh, nice - I hadn't seen Dumbo. That's far cleaner than the Python wrapper to streaming I'd hacked together. I'm probably going to be using Hadoop more again in the near future, so I'll bookmark that. Thanks, Steve.

Personally I only need text-based records, so I'm fine using a wrapper around streaming.

Tim Wintle