[ https://issues.apache.org/jira/browse/HBASE-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Purtell resolved HBASE-2000.
-----------------------------------

       Resolution: Fixed
    Fix Version/s: 0.92.0
         Assignee:     (was: Andrew Purtell)
     Hadoop Flags: [Reviewed]

> Coprocessors
> ------------
>
>                 Key: HBASE-2000
>                 URL: https://issues.apache.org/jira/browse/HBASE-2000
>             Project: HBase
>          Issue Type: New Feature
>          Components: coprocessors
>            Reporter: Andrew Purtell
>             Fix For: 0.92.0
>
>
> From Google's Jeff Dean, in a keynote to LADIS 2009
> (http://www.scribd.com/doc/21631448/Dean-Keynote-Ladis2009, slides 66 - 67):
>
> BigTable Coprocessors (New Since OSDI'06)
> * Arbitrary code that runs next to each tablet in table
> ** As tablets split and move, coprocessor code automatically splits/moves
> too
> * High-level call interface for clients
> ** Unlike RPC, calls addressed to rows or ranges of rows
> * Coprocessor client library resolves to actual locations
> ** Calls across multiple rows automatically split into multiple
> parallelized RPCs
> * Very flexible model for building distributed services
> ** Automatic scaling, load balancing, request routing for apps
>
> Example Coprocessor Uses
> * Scalable metadata management for Colossus (next gen GFS-like file system)
> * Distributed language model serving for machine translation system
> * Distributed query processing for full-text indexing support
> * Regular expression search support for code repository
>
> For HBase, adding a coprocessor framework will allow for pluggable
> incremental addition of functionality. No more need to subclass the
> regionserver interface and implementation classes and set
> {{hbase.regionserver.class}} and {{hbase.regionserver.impl}} in
> hbase-site.xml. That mechanism allows for extension, but to the exclusion of
> all others.
>
> Also in HBASE-2001 there is currently an in-process MapReduce framework for
> the regionservers. Coprocessors can optionally implement a 'MapReduce'
> interface which clients will be able to invoke concurrently on all regions of
> the table.
> Note this is not MapReduce on the table; this is MapReduce on each
> region, concurrently. One can implement MapReduce in a manner very similar to
> Hadoop's MR framework, or use shared variables to avoid the overhead of
> generating (and processing) a lot of intermediate results. An initial
> application of this could be support for rapid calculation of aggregates over
> data stored in HBase.
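
The idea of a call addressed to a range of rows, split into one parallel call per region and combined on the client, can be sketched in plain Java. This is only an illustrative model under stated assumptions, not the HBase coprocessor API: the Region class, its sumRange method, the sum helper, and the demo data are all hypothetical names made up for this sketch, and "regions" here are just in-memory sorted maps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class RegionAggregateSketch {

    /** Hypothetical stand-in for a region: the rows in [startKey, endKey). */
    static final class Region {
        final String startKey;                 // inclusive
        final String endKey;                   // exclusive; null means unbounded
        final NavigableMap<String, Long> rows = new TreeMap<>();

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }

        /** True if this region's key range intersects [from, to). */
        boolean overlaps(String from, String to) {
            return startKey.compareTo(to) < 0
                && (endKey == null || endKey.compareTo(from) > 0);
        }

        /** Runs next to the data: sums only the rows this region holds. */
        long sumRange(String from, String to) {
            long sum = 0;
            for (long v : rows.subMap(from, true, to, false).values()) {
                sum += v;
            }
            return sum;
        }
    }

    /** Client side: one parallel call per overlapping region, then combine. */
    static long sum(List<Region> table, String from, String to) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Long>> partials = new ArrayList<>();
            for (Region r : table) {
                if (r.overlaps(from, to)) {
                    Callable<Long> call = () -> r.sumRange(from, to);
                    partials.add(pool.submit(call));
                }
            }
            long total = 0;
            for (Future<Long> f : partials) {
                total += f.get();              // wait for each regional partial sum
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    /** Two made-up regions split at row key "m". */
    static List<Region> demoTable() {
        Region a = new Region("", "m");
        Region b = new Region("m", null);
        a.rows.put("apple", 1L);
        a.rows.put("kiwi", 2L);
        b.rows.put("mango", 3L);
        b.rows.put("pear", 4L);
        return Arrays.asList(a, b);
    }

    public static void main(String[] args) throws Exception {
        // The range [b, n) spans both regions, so two partial sums are combined.
        System.out.println(sum(demoTable(), "b", "n"));
    }
}
```

Each region returns a single partial sum instead of shipping its rows to the client, which is the kind of saving on intermediate results the description has in mind for aggregates. A real framework would also have to handle regions splitting or moving while a call is in flight, which this sketch ignores.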