On Wed, Oct 7, 2009 at 10:33 AM, Sreejith K <[email protected]> wrote:
>
> Hi all,
>
> I'm looking for a way to use MapReduce with Hypertable using Hadoop
> Streaming. How can I pass Hypertable data to the MapReduce framework using
> Python as HadoopStreaming only supports text input?

Until the next release of Hypertable, which will contain the
ThriftInputFormat/ThriftOutputFormat classes (in hypertable*.jar), you'll
have to write a custom InputFormat/OutputFormat yourself, using the Thrift
API's open_scanner on the METADATA table to create the proper InputSplits,
etc. If you don't want to do that, I suggest that you wait until (probably
some time) next week :)

> I also need to apply some filters (GQL like) to the Hypertable cells. How can
> I pass these filters to the mapper program?

Since the -mapper argument takes a command, you can just use
-mapper "my.py --filter f1,f2...". If you're thinking about the HQL where
clause, you'll be able to specify
-D map.hypertable.hql="select c1, c2 from sometable where ..."

__Luke

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Hypertable Development" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/hypertable-dev?hl=en
-~----------~----~----~----~------~----~------~--~---
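
For illustration, here is a rough sketch of what the my.py --filter f1,f2 command suggested above might look like. Everything in it is an assumption rather than anything Hypertable ships: the script name comes from the reply, but the record layout (one cell per line, as row<TAB>column<TAB>value) and the meaning of the filter names (column names to keep) are hypothetical, since the text format the input format will hand to Streaming is not described in this thread.

```python
#!/usr/bin/env python3
"""Hypothetical my.py: a Hadoop Streaming mapper that keeps only chosen columns."""
import argparse
import sys


def build_parser():
    parser = argparse.ArgumentParser(description="Filter streamed cells by column.")
    parser.add_argument("--filter", default="",
                        help="comma-separated column names to keep, e.g. f1,f2")
    return parser


def parse_filters(spec):
    """Turn 'f1,f2' into {'f1', 'f2'}; blank entries are ignored."""
    return {name.strip() for name in spec.split(",") if name.strip()}


def keep(line, wanted):
    """Return True if the cell on this line should be passed through.

    ASSUMPTION: each record is 'row<TAB>column<TAB>value'. Lines with fewer
    than three fields are dropped; an empty filter set keeps every column.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        return False
    return not wanted or fields[1] in wanted


def main(argv, stdin=sys.stdin, stdout=sys.stdout):
    args = build_parser().parse_args(argv)
    wanted = parse_filters(args.filter)
    for line in stdin:
        if keep(line, wanted):
            # Streaming treats the text before the first tab (the row) as the key.
            stdout.write(line.rstrip("\n") + "\n")
    return 0


if __name__ == "__main__":
    if len(sys.argv) > 1:
        sys.exit(main(sys.argv[1:]))
    # No arguments: show usage instead of silently waiting on stdin.
    build_parser().print_help(sys.stderr)
```

Because the filter list arrives as an ordinary command-line argument, the same script can be tried locally by piping a few sample lines into it before handing it to Streaming as the mapper command. The script also has to be available on the task nodes, for example by shipping it with Streaming's -file option.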
