Thanks for the detailed response, Jon.

bq. it would mean that a query based on secondary index would potentially have to hit every region server that has a region in the primary table.

Can you elaborate on the above a little bit? Is this because the secondary index would point us to more than one region in the data table, since several versions are saved for the same row? My thinking was to ease management of simultaneous (data and index) region splits through region colocation.
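To spell out how I currently picture the two lookup paths (so you can correct me where I have it wrong), here is a rough sketch. The table names, key layout, and the idea of keeping the local index in an extra column family are my own invention purely for illustration, and the client API details will vary by HBase version:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class IndexLookupSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf)) {

          // Global index table, keyed by <indexed value>#<data row key>.
          // A lookup for "val2" is a narrow range scan that only touches the
          // index regions covering that value.
          Table globalIndex = conn.getTable(TableName.valueOf("data_idx"));
          Scan byValue = new Scan()
              .withStartRow(Bytes.toBytes("val2#"))
              .withStopRow(Bytes.toBytes("val2$"));
          try (ResultScanner hits = globalIndex.getScanner(byValue)) {
            for (Result r : hits) {
              byte[] dataRowKey = r.getRow(); // points back at the data table row
            }
          }

          // Region-local index, modelled here as an extra column family that is
          // partitioned by the DATA row key. The client cannot narrow the key
          // range for "val2", so the scan covers every region of the data table
          // and therefore potentially every region server.
          Table data = conn.getTable(TableName.valueOf("data"));
          Scan everywhere = new Scan();
          everywhere.addFamily(Bytes.toBytes("idx"));
          try (ResultScanner all = data.getScanner(everywhere)) {
            for (Result r : all) {
              // matching entries for "val2" have to be filtered out per region
            }
          }
        }
      }
    }

If that reading is right, the colocated/local variant only pays off when the lookup matches many rows, which is the cardinality point you make below.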
Cheers

On Wed, Aug 29, 2012 at 6:47 AM, Jonathan Hsieh <[email protected]> wrote:

> I'm more of a fan of having secondary indexes added as an external feature (coproc or a new client library on top of our current client library) and focusing on only adding the APIs necessary to make 2ndary indexes possible and correct on/in HBase. There are many different use patterns and requirements, and one style of secondary index will not be good for everything. Do we only care about this working well for highly selective keys? What are the possible indexes (col name, value, value prefix, everything our filters support)? Do we care more about writes or reads, ACID correctness or speed, etc.? Also, there are several questions about how we handle other features in conjunction with 2ndary indexes: replication, bulk load, and snapshots, to name a few.
>
> Maybe it makes sense to spend some time defining what we want to index secondarily and what a user API to this external feature would be. Then we could have the different implementations under the covers, and allow users to swap implementations for the tradeoffs that fit their use cases. It wouldn't be free to change, but hopefully "easy" from a user point of view.
>
> Personally, I've tended to favor more of a Percolator-style implementation -- it is a client library built on top of HBase. This approach seems more "HBase-style" with its emphasis on consistency and atomicity, and seems to require only a few modifications to HBase core. Sure, it's likely slower than my read of Jesse's proposal, but it seems always consistent and thus predictable in cases where there are failures on deletes and updates. We'd need HBase API primitives like a checkAndMutate call (check with multiple delete/put on the same row), and possibly an atomic multi-table bulkload. I'm not sure that it is replication compatible, and there are probably questions we'll need to answer once the snapshot work solidifies.
>
> Ted's idea of colocating regions (like the index table's regions) definitely feels like a primitive (pluggable, likely per-table region assignment plans) that we could add to HBase core. This requirement for 2ndary indexes, though, seems to imply an approach similar to Cassandra's -- having a local index for each region on the region server and colocating them. Is this right? If so, this is essentially a filtering optimization -- it would mean that a query based on the secondary index would potentially have to hit every region server that has a region in the primary table. This is a great approach if the index lookup has high cardinality, but if the secondary index is highly selective, you'd have to march through a bunch of RSs before getting an answer.
>
> Jon.
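[Inline, on the checkAndMutate primitive mentioned above] Here is a rough sketch of what I imagine that call shape looking like from the client side -- check the value we read earlier and, only if it is unchanged, apply a delete plus a put on the same row atomically. The table and column names are invented and the exact method signatures are only illustrative:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Delete;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.RowMutations;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class CheckAndMutateSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        byte[] row = Bytes.toBytes("row1");
        byte[] cf  = Bytes.toBytes("d");

        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table data = conn.getTable(TableName.valueOf("data"))) {

          // Multiple delete/put on the same row, applied as one atomic unit...
          RowMutations rm = new RowMutations(row);
          // ...drop a bookkeeping cell left behind by the previous update
          rm.add(new Delete(row).addColumns(cf, Bytes.toBytes("pending")));
          // ...and install the new value in the same step
          rm.add(new Put(row).addColumn(cf, Bytes.toBytes("val"), Bytes.toBytes("val3")));

          // Guard the whole thing on the value we read before starting: the
          // mutations are applied only if d:val still equals "val2".
          boolean applied = data.checkAndMutate(row, cf)
              .qualifier(Bytes.toBytes("val"))
              .ifEquals(Bytes.toBytes("val2"))
              .thenMutate(rm);

          if (!applied) {
            // Another writer got there first: re-read the row, recompute which
            // index entries to add/remove, and retry.
          }
        }
      }
    }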
>
> On Tue, Aug 28, 2012 at 9:18 PM, Ramkrishna.S.Vasudevan <[email protected]> wrote:
>
> > Hi
> >
> > Yes, I was talking about the dead entry in the index table rather than the actual data table.
> >
> > Regards
> > Ram
> >
> > > -----Original Message-----
> > > From: Wei Tan [mailto:[email protected]]
> > > Sent: Tuesday, August 28, 2012 9:22 PM
> > > To: [email protected]
> > > Cc: Sandeep Tata
> > > Subject: Re: A general question on maxVersion handling when we have Secondary index tables
> > >
> > > Thanks for sharing a pointer to your implementation. My two cents: the timestamp is a way to do MVCC, and setting every KV with the same TS will make concurrency control very tricky and error prone, if not impossible. I think Ram is talking about the dead entry in the index table rather than the data table. Deleting old index entries upfront when there is a new put might be a choice.
> > >
> > > Best Regards,
> > > Wei
> > >
> > > Wei Tan
> > > Research Staff Member
> > > IBM T. J. Watson Research Center
> > > 19 Skyline Dr, Hawthorne, NY 10532
> > > [email protected]; 914-784-6752
> > >
> > > From: Jesse Yates <[email protected]>
> > > To: [email protected]
> > > Date: 08/28/2012 04:00 AM
> > > Subject: Re: A general question on maxVersion handling when we have Secondary index tables
> > >
> > > Ram,
> > >
> > > If I understand correctly, I think you can design your index such that you don't actually use the timestamp (e.g. everything gets put with a TS = 10, or some other non-special, relatively small number that's not 0, as I'd worry about that in HBase ;). Then when you set maxVersions to 1, everything should be good.
> > >
> > > You get a couple of wasted bytes from the TS, but with the prefixTrie stuff that should be pretty minimal overhead. If you do need to keep track of the timestamp, you should be able to munge that back up into the column qualifier (and just know that the last 64 bits are the timestamp). Again, a little more CPU cost, but it's really not that big of an overhead. It seems like you don't really care about the TS though, in which case this should be pretty simple.
> > >
> > > Out of curiosity, what are people using for their secondary indexing solutions? I know there are a bunch out there, but I don't know what people have adopted, what they like/dislike, and what design tradeoffs were made and why.
> > >
> > > Disclaimer: I recently proposed a secondary indexing solution myself (shameless self-plug: http://jyates.github.com/2012/07/09/consistent-enough-secondary-indexes.html) and it's something I'm working on for Salesforce -- open sourced at some point, promise!
> > >
> > > -Jesse
> > > -------------------
> > > Jesse Yates
> > > @jesse_yates
> > > jyates.github.com
> > >
> > > On Tue, Aug 28, 2012 at 12:24 AM, Ramkrishna.S.Vasudevan <[email protected]> wrote:
> > >
> > > > Hi All
> > > >
> > > > When we try to build any type of secondary indices for a given table, how can one handle maxVersions in the secondary index tables?
> > > >
> > > > For eg, I have inserted
> > > >
> > > > Row1 - Val1 => t
> > > > Row1 - Val2 => t+1
> > > > Row1 - Val3 => t+2
> > > >
> > > > Ideally, if my max versions is only one, then Val3 should be my result if I query the main table for row1.
> > > >
> > > > Now in my index I will be having all the above 3 entries.
> > > > Now how can we remove the older entries from the index table that do not fit into maxVersions?
> > > >
> > > > Currently, while scanning, the code that enforces maxVersions does not give any hooks to know which entries were skipped due to versions.
> > > >
> > > > So, any suggestions on this? I am still going through the code for other options, but suggestions are welcome.
> > > >
> > > > Regards
> > > > Ram
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // [email protected]
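P.S. To check that I read Jesse's timestamp suggestion (quoted above) the way he meant it, here is a rough sketch of the index writes as I picture them. The index table name, key layout, and column names are invented for illustration, and the client API details will differ between HBase versions:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class FixedTsIndexPutSketch {
      // Every index cell is written with this one timestamp, so the cell TS
      // carries no information and VERSIONS => 1 on the index table behaves
      // predictably (Jesse's "TS = 10" suggestion).
      private static final long FIXED_TS = 10L;

      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table index = conn.getTable(TableName.valueOf("data_idx"))) {

          byte[] dataRow = Bytes.toBytes("row1");
          byte[] value   = Bytes.toBytes("val3");
          byte[] fam     = Bytes.toBytes("i");

          // One possible index layout: row key = <indexed value>#<data row key>.
          byte[] indexRow = Bytes.add(value, Bytes.toBytes("#"), dataRow);

          // Variant (a): plain qualifier, fixed TS. Re-writing the same index
          // row simply replaces the single existing cell.
          Put a = new Put(indexRow);
          a.addColumn(fam, Bytes.toBytes("q"), FIXED_TS, dataRow);
          index.put(a);

          // Variant (b): if the real write time still matters, append it to
          // the qualifier (last 8 bytes) instead of using the cell timestamp.
          long realTs = System.currentTimeMillis();
          byte[] qualWithTs = Bytes.add(Bytes.toBytes("q"), Bytes.toBytes(realTs));
          Put b = new Put(indexRow);
          b.addColumn(fam, qualWithTs, FIXED_TS, dataRow);
          // index.put(b);  // pick one variant in a real design, not both
        }
      }
    }

If I have that right, variant (b) keeps one cell per write time, so it is only useful when the history itself is wanted.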
