While in Beijing I met with a group at the Institute of Computing at the 
Chinese Academy of Sciences who are interested in contributing a secondary 
indexing scheme for HBase. It is my understanding that this is the same group 
that contributed RCFile to Hive. The links below point to a slide deck and a 
technical report describing their work, called CCIndex.

Slides: https://iridiant.s3.amazonaws.com/ccindex_v1.pdf
Paper: https://iridiant.s3.amazonaws.com/CCIndex.pdf

We discussed posting their code, which is based on 0.20.1, on GitHub as a 
first step, and they agreed. This should happen soon.

We also discussed a possible path for contributing this work in a 
maintainable, distributable form as a coprocessor-based reimplementation. The 
idea is to consider framework support for what CCIndex needs at a low level 
(I/O concerns) and to split the rest out into a coprocessor. I've heard other 
talk of implementing secondary indexing on a coprocessor foundation. I think 
CCIndex is one option on the table and a starting point for discussion.

Best regards,

    - Andy


      
