Yes, that was a big concern I had as well. It's not clear how that will work with a large number of indexes; if people have one index, they will want more than one. To not plan for that seems like an incomplete implementation to me. In a horizontally scalable system like HBase, lots of buddy regions aren't going to work out well.* Once we have regions that cannot be collocated, the extra RPC time becomes the biggest factor (as the doc points out) and we are back to what Phoenix is already doing.**

But I'm probably missing something here in what makes it different? For folks that haven't been following the issue, some high-level "how it all kinda works" would be helpful from the championing committers; that's a long doc to get through and grok :).

How similar is this to the work currently done by the existing indexing implementations (huawei, Phoenix, ngdata)? The doc doesn't really nail down the interactions, but instead jumps right in after describing why SI should be added. Agree this would be super useful, but don't want to waste too much work reinventing the wheel or doing the wrong thing.

Further, this impl quickly starts to lead down the query optimization path, which gets HBase away from its core "be a great byte store".

Like I said, I'm all for secondary indexes in HBase and think this is a great push. I don't mean to rain on any parades.

- jesse

* but a smart way to specify region collocation? That I can get behind, as it would unify a couple of different indexing impls (e.g., Phoenix would consider using it to help make indexing faster - RPCs do suck).

** for instance, the doc talks about how to implement indexing for floats... That might be a default impl, but for use cases like Phoenix this would break all our current encodings. We handled this in the indexing impl by making the builder pluggable, so different use cases can support different encodings. I feel like a lot of the code for this kind of SI impl is already in Phoenix and has been working and fast for several months now; it's surprisingly tricky, especially with the delete cases and timestamp manipulation issues.

On Thursday, January 9, 2014, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) wrote:

> Could you explain how the 1-1 association between user and index table
> regions is maintained. I wasn't able to understand fully from the document.
>
> ----- Original Message -----
> From: Ted Yu <[email protected]>
> To: [email protected]
> At: Jan 8, 2014 3:41:40 PM
>
> Hi,
> Secondary index support is a frequently requested feature.
>
> Please find the updated design doc here:
>
> https://issues.apache.org/jira/secure/attachment/12621909/SecondaryIndex%20Design_Updated_2.pdf
>
> HBASE-9203 is the umbrella JIRA.
>
> Implementation patch was attached to HBASE-10222
>
> Thanks to Rajesh who works on this feature.
>
> Cheers
>

--
-------------------
Jesse Yates
@jesse_yates
jyates.github.com
