On Wed, Oct 7, 2009 at 10:33 AM, Sreejith K <[email protected]> wrote:
>
> Hi all,
>
> I'm looking for a way to use MapReduce with Hypertable using Hadoop
> Streaming. How can I pass Hypertable data to the MapReduce framework
> using Python, as Hadoop Streaming only supports text input?

Until the next release of Hypertable, which will contain the
ThriftInputFormat/ThriftOutputFormat classes (in hypertable*.jar),
you'll have to write a custom InputFormat/OutputFormat yourself, using
the Thrift API's open_scanner on the METADATA table to create the proper
InputSplits etc. If you don't want to do that, I suggest that you wait
until (probably some time) next week :)

> I also need to apply some filters (GQL-like) to the Hypertable cells.
> How can I pass these filters to the mapper program?

Since the -mapper argument takes a command, you can just use -mapper
"my.py --filter f1,f2...". If you're thinking about the HQL WHERE
clause, you'll be able to specify -D map.hypertable.hql="select c1, c2
from sometable where ..."
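
To make the first option concrete, here is a rough sketch of what such a
my.py could look like. It is only an illustration: the input layout
(row<TAB>column<TAB>value, one cell per line) is an assumption, not
something Hypertable or the upcoming input format is known to emit, and
the helper names parse_filters and map_lines are made up for this example.

```python
#!/usr/bin/env python
"""Hypothetical my.py: a streaming mapper that keeps only selected columns."""
import argparse
import sys


def parse_filters(spec):
    """Split a comma-separated --filter value into a set of column names."""
    return {name.strip() for name in spec.split(",") if name.strip()}


def map_lines(lines, filters):
    """Yield the lines whose column field is listed in `filters`.

    Assumed layout (not from the thread): row<TAB>column<TAB>value.
    An empty filter set lets every well-formed line through.
    """
    for line in lines:
        # Split at most twice so that tabs inside the value are preserved.
        fields = line.rstrip("\n").split("\t", 2)
        if len(fields) < 3:
            continue  # skip lines that do not match the assumed layout
        if not filters or fields[1] in filters:
            yield "\t".join(fields)


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--filter", default="",
                        help="comma-separated column names to keep")
    args = parser.parse_args(argv)
    # Streaming mappers read records from stdin and write results to stdout.
    for out in map_lines(sys.stdin, parse_filters(args.filter)):
        print(out)


if __name__ == "__main__":
    main()
```

The script would be given to Hadoop Streaming as the mapper command, e.g.
"my.py --filter f1,f2", so the filter list reaches the mapper as an
ordinary command-line argument.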

__Luke

