Re: Design review: Secondary index support through coprocess

Jesse Yates Thu, 09 Jan 2014 06:41:25 -0800

Yes, that was a big concern I had as well.

It's not clear how that will work with a large number of indexes; if people
have one index, they will want more than one. To not plan for that seems
like an incomplete implementation to me. In a horizontally scalable system
like HBase, lots of buddy regions aren't going to work out well.* Once we
have regions that cannot be collocated, the extra RPC time starts to be the
biggest factor (as the doc points out) and we are back to what Phoenix is
already doing**.

But I'm probably missing something here in what makes it different?

For folks that haven't been following the issue, some high-level "how it all
kinda works" would be helpful from the championing committers; that's a long
doc to get through and grok :). How similar is this to the work currently
done by the existing indexing implementations (Huawei, Phoenix, NGDATA)? The
doc doesn't really nail down the interactions, but instead jumps right in
after describing why SI should be added.

Agree this would be super useful, but don't want to waste too much work
reinventing the wheel or doing the wrong thing. Further, this impl quickly
starts to lead down the query optimization path, which gets HBase away from
its core "be a great byte store".

Like I said, I'm all for secondary indexes in HBase and think this is a
great push. I don't mean to rain on any parades.

- jesse

* but a smart way to specify region collocation? That I can get behind as
it would unify a couple different indexing impls (e.g., Phoenix would
consider using it to help make indexing faster - RPCs do suck).

** for instance, the doc talks about how to implement indexing for
floats... That might be a default impl, but for use cases like Phoenix this
would break all our current encodings. We handled this in the indexing impl
by making the builder pluggable for different use cases to support
different encodings. I feel like a lot of the code for this kind of SI
impl is already in Phoenix and has been working and fast for several months
now; it's surprisingly tricky, especially with the delete cases and
timestamp manipulation issues.

On Thursday, January 9, 2014, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN)
wrote:

> Could you explain how the 1-1 association between user and index table
> regions is maintained? I wasn't able to understand fully from the document.
>
> ----- Original Message -----
> From: Ted Yu <[email protected]>
> To: [email protected]
> At: Jan 8, 2014 3:41:40 PM
>
> Hi,
> Secondary index support is a frequently requested feature.
>
> Please find the updated design doc here:
>
> https://issues.apache.org/jira/secure/attachment/12621909/SecondaryIndex%20Design_Updated_2.pdf
>
> HBASE-9203 is the umbrella JIRA.
>
> Implementation patch was attached to HBASE-10222
>
> Thanks to Rajesh who works on this feature.
>
> Cheers
>

-- 
-------------------
Jesse Yates
@jesse_yates
jyates.github.com
