Hi folks - New to the community here, but encouraging to see it's alive and active. :)
We're currently assessing HBase as an option to provide tables that are queried directly from a web interface for an analytics application. Does anyone have an example of how a typical analytic data warehouse star and/or snowflake schema design might look in HBase? How to do this in a traditional RDBMS is extremely well documented, but I can't find a simple example of this type of schema design for HBase.
Also, is there a summary of how HBase compares to BigTable in terms of features? Obviously BigTable is closed-source, so we can't know definitively, but I've been looking at http://code.google.com/appengine/docs/python/datastore/ and am wondering if there are obvious features that HBase is missing. Specifically, I'm wondering how BigTable and HBase compare in the areas of:
1. What is done client-side vs. in the data store / BigTable
2. (Distributed?) query support with filters
3. Indexing, (distributed?) sorting, distributed SUM, AVG, etc.
4. Parallelization of queries and writes
5. Anything else relevant

Any pointers or help much appreciated.

Thx - Adam
