Wow. That's the first time in 25 years that I've heard someone actually reference the dining philosophers problem. ;-)
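For anyone skimming the quoted thread below, here is a minimal sketch of the handler-exhaustion deadlock Vladimir describes. It is a toy model in Python, not HBase code: the `simulate` function, the ring of servers, and the greedy worst-case scheduling are all assumptions made for illustration. Each client request holds one handler on its own server and needs one handler on the next server in the ring before it can finish.

```python
def simulate(num_servers: int, handlers: int, requests_per_server: int) -> str:
    """Toy model of cyclic cross-server RPCs; returns "completed" or "deadlock"."""
    free = [handlers] * num_servers                # idle handlers per server
    queued = [requests_per_server] * num_servers   # client requests not yet admitted
    waiting = [0] * num_servers                    # admitted requests blocked on the next server

    while True:
        progress = False
        # Assumed worst-case scheduling: every server admits client requests first.
        for s in range(num_servers):
            while queued[s] > 0 and free[s] > 0:
                queued[s] -= 1
                free[s] -= 1
                waiting[s] += 1
                progress = True
        # A blocked request on server s needs one handler on the next server in the ring.
        for s in range(num_servers):
            target = (s + 1) % num_servers
            while waiting[s] > 0 and free[target] > 0:
                # The nested call borrows a target handler and returns it right away,
                # so only the outer handler on server s is released here.
                waiting[s] -= 1
                free[s] += 1
                progress = True
        if sum(queued) == 0 and sum(waiting) == 0:
            return "completed"
        if not progress:
            return "deadlock"


if __name__ == "__main__":
    for handlers, load in [(2, 1), (2, 2), (100, 99), (100, 100)]:
        print(handlers, load, simulate(3, handlers, load))
```

Under these assumptions a bigger handler pool only raises the load needed to get stuck: once every handler in the ring is held by a request waiting on the next server, nothing can move, whatever the pool size.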
On Jan 22, 2014, at 1:35 PM, Wei Tan <w...@us.ibm.com> wrote:

> Thanks, Vladimir. So an RPC call RS1 --> RS2 takes two handlers, one from
> RS1 and one from RS2? If that is true, then I understand that it is a
> typical Dining philosophers problem.
>
> Maybe a random yielding mechanism can solve this problem.
> Best regards,
> Wei
>
> ---------------------------------
> Wei Tan, PhD
> Research Staff Member
> IBM T. J. Watson Research Center
> http://researcher.ibm.com/person/us-wtan
>
> From: Vladimir Rodionov <vrodio...@carrieriq.com>
> To: "dev@hbase.apache.org" <dev@hbase.apache.org>,
> Date: 01/22/2014 12:09 PM
> Subject: RE: Design review: Secondary index support through coprocess
>
> Deadlocks are possible because cross region RPCs create cyclic
> dependencies in HBase cluster.
>
> RS1-> RS2->RS3->RS1, where -> is RPC call
>
> now imagine that last call from RS3 to RS1 is blocked because there are no
> more available handler threads to process it.
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodio...@carrieriq.com
>
> ________________________________________
> From: Wei Tan [w...@us.ibm.com]
> Sent: Wednesday, January 22, 2014 7:51 AM
> To: dev@hbase.apache.org
> Subject: RE: Design review: Secondary index support through coprocess
>
> Why is cross-RS RPC going to cause deadlocks? Is it a matter of logic
> incorrectness, or resource outage? Say, if we set the #handler to be
> large, does deadlock logically still occur?
> Best regards,
> Wei
>
> From: Vladimir Rodionov <vrodio...@carrieriq.com>
> To: "dev@hbase.apache.org" <dev@hbase.apache.org>,
> Date: 01/20/2014 03:00 PM
> Subject: RE: Design review: Secondary index support through coprocess
>
> >>> Yes, the coprocessors potentially cross RS boundaries.
>
> The open path to the disaster.
> Inter region RPCs in coprocessors may result in periodic cluster-wide
> deadlocks
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodio...@carrieriq.com
>
> ________________________________________
> From: James Taylor [jtay...@salesforce.com]
> Sent: Monday, January 20, 2014 11:39 AM
> To: dev@hbase.apache.org
> Subject: Re: Design review: Secondary index support through coprocess
>
> Yes, the coprocessors potentially cross RS boundaries. No, the index is not
> co-located with the main table. Take a look at the link I sent as that
> should be able to answer a lot of questions.
>
> Thanks,
> James
>
> On Mon, Jan 20, 2014 at 11:03 AM, Michael Segel
> <michael_se...@hotmail.com> wrote:
>
>> James,
>>
>> Ok…
>>
>> It's been a while since we talked about this…
>>
>> While the index is in a separate table, is that table being split and
>> collocated with the main table?
>>
>> If you’re using the coprocessor to maintain the index, that would imply
>> you’re crossing RS boundaries if your index is truly orthogonal.
>>
>> Is this what you’re doing?
>>
>> On Jan 20, 2014, at 11:32 AM, James Taylor <jtay...@salesforce.com> wrote:
>>
>>> Mike,
>>> Yes, you're mistaken:
>>> - secondary indexes in Phoenix are orthogonal to the base table. They're
>>> in a separate table
>>> (http://phoenix.incubator.apache.org/secondary_indexing.html).
>>> - Phoenix has joins. They're in our master branch with a release scheduled
>>> for next month
>>> - numeric strings? Not a use case for indexing numeric data? Have you ever
>>> seen a number used as an ID?
>>> Thanks,
>>> James
>>>
>>> On Mon, Jan 20, 2014 at 8:50 AM, Michael Segel
>>> <michael_se...@hotmail.com> wrote:
>>>
>>>> Indexes tend to be orthogonal to the base table, not to mention if you’re
>>>> using an inverted table for an index, your index table would be much
>>>> thinner than your base table.
>>>>
>>>> Having said that, the solution proposed by Yu, Taylor and others only
>>>> works if you want to use the index to help on server side filtering and
>>>> misses the boat on the larger and broader picture of improving query
>>>> optimization and joins.
>>>>
>>>> HINT: Unless I am mistaken… until you treat the index as orthogonal to the
>>>> base table, you will always lag performance of traditional MPP DWs like
>>>> Informix XPS. (Now part of IBM’s IM pillar)
>>>>
>>>> In addition, until you fix coprocessors in general, you will have
>>>> scalability and performance issues.
>>>> (Note that you can write a coprocessor to create a sandbox and separate
>>>> the co-process from the RS jvm, however it would be better if it were part
>>>> of the underlying coprocessor code.)
>>>>
>>>> The current implementation makes joins worthless.
>>>> (Note that in prior discussions, Phoenix doesn’t do joins…)
>>>> Here’s why:
>>>> In order to do a join, if you use the proposed index, you have to first
>>>> reduce each index into a single, sort ordered set. Then you can take the
>>>> intersection of the index result sets. The final set would be in sort
>>>> order and a subset of the total rows. You can then fetch the rows and still
>>>> do a server side filter before returning the ultimate result set.
>>>>
>>>> It's that first step of reducing each result set into a single sort
>>>> ordered set that takes a lot of effort.
>>>>
>>>> On a side note…. there’s been some mention of ordering floats. Again, just
>>>> a word of caution… there isn’t a really strong use case for indexing
>>>> numeric data types. period. And to be very, very clear, there is a
>>>> distinction between numeric strings and numeric data types.
>>>>
>>>> -Mike
>>>>
>>>> PS. Because of my role as a consultant, I am very, very limited in what I
>>>> can say and contribute. I don’t own my work product, my clients do. Take
>>>> what I say with a grain of salt.
>>>> I’m just a skinny little boy from Cleveland Ohio, come to chase your
>>>> beers and drink your women… ;-)
>>>>
>>>> On Jan 9, 2014, at 10:48 AM, James Taylor <jtay...@salesforce.com> wrote:
>>>>
>>>>> IMHO, it would be valuable if the design considered both a global
>>>>> indexing solution and a local indexing solution. Both are useful in
>>>>> different circumstances. The global indexing design plus the
>>>>> application integration points could be derived from Jesse's work with
>>>>> his reference implementation in Phoenix - the global indexing code has
>>>>> no Phoenix dependencies and clearly defined integration points.
>>>>>
>>>>> Thanks,
>>>>> James
>>>>>
>>>>> On Jan 9, 2014, at 6:36 AM, Jesse Yates <jesse.k.ya...@gmail.com> wrote:
>>>>>
>>>>>> Yes, that was a big concern I had as well.
>>>>>>
>>>>>> It's not clear how that will work with a large number of indexes; if
>>>>>> people have one index, they will want more than one. To not plan for that
>>>>>> seems like an incomplete implementation to me. In a horizontally scalable
>>>>>> system like HBase, lots of buddy region isn't going to work out well..*
>>>>>> Once we have regions that cannot be collocated, the extra RPC time starts
>>>>>> to be the biggest factor (as the doc points out) and we are back to what
>>>>>> Phoenix is already doing**.
>>>>>>
>>>>>> But I'm probably missing something here in what makes it different?
>>>>>>
>>>>>> For folks that haven't been following the issue some high-level "how it
>>>>>> all kinda works" would be helpful from the championing committers; that's
>>>>>> a long doc to get through and grok :). How similar is this to the work
>>>>>> currently by the existing indexing implementations (huawei, Phoenix,
>>>>>> ngdata)? The doc doesn't really nail down the interactions, but instead
>>>>>> just right in after describing why SI should be added.
>>>>>>
>>>>>> Agree this would be super useful, but don't want to waste too much work
>>>>>> reinventing the wheel or doing the wrong thing. Further, this impl quickly
>>>>>> starts to lead down the query optimization path, which gets HBase away
>>>>>> from its core "be a great byte store".
>>>>>>
>>>>>> Like I said, I'm all for secondary indexes in HBase and think this is a
>>>>>> great push. I don't mean to rain on any parades.
>>>>>>
>>>>>> - jesse
>>>>>>
>>>>>> * but a smart way to specify region collocation? That I can get behind as
>>>>>> it would unify a couple different indexing impls (e.g. Phoenix would
>>>>>> consider using it to help make indexing faster - RPCs do suck).
>>>>>>
>>>>>> ** for instance, the doc talks about how to implement indexing for
>>>>>> floats... That might be a default impl, but for use cases like Phoenix
>>>>>> this would break all our current encodings. We handled this in the
>>>>>> indexing impl by making the builder pluggable for different use cases to
>>>>>> support different encodings. I feel like a lot of the code for this kind
>>>>>> of SI impl is already in Phoenix and has been working and fast for several
>>>>>> months now; it's surprisingly tricky, especially with the delete cases and
>>>>>> time stamp manipulation issues.
>>>>>>
>>>>>> On Thursday, January 9, 2014, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN)
>>>>>> wrote:
>>>>>>
>>>>>>> Could you explain how the 1-1 association between user and index table
>>>>>>> regions is maintained. I wasn't able to understand fully from the
>>>>>>> document.
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>> From: Ted Yu <dev@hbase.apache.org>
>>>>>>> To: dev@hbase.apache.org
>>>>>>> At: Jan 8, 2014 3:41:40 PM
>>>>>>>
>>>>>>> Hi,
>>>>>>> Secondary index support is a frequently requested feature.
>>>>>>>
>>>>>>> Please find the updated design doc here:
>>>>>>>
>>>>>>> https://issues.apache.org/jira/secure/attachment/12621909/SecondaryIndex%20Design_Updated_2.pdf
>>>>>>>
>>>>>>> HBASE-9203 is the umbrella JIRA.
>>>>>>>
>>>>>>> Implementation patch was attached to HBASE-10222
>>>>>>>
>>>>>>> Thanks to Rajesh who works on this feature.
>>>>>>>
>>>>>>> Cheers
>>>>>>
>>>>>> --
>>>>>> -------------------
>>>>>> Jesse Yates
>>>>>> @jesse_yates
>>>>>> jyates.github.com
>
> Confidentiality Notice: The information contained in this message,
> including any attachments hereto, may be confidential and is intended to
> be read only by the individual or entity to whom this message is
> addressed. If the reader of this message is not the intended recipient or
> an agent or designee of the intended recipient, please note that any
> review, use, disclosure or distribution of this message or its
> attachments, in any form, is strictly prohibited. If you have received
> this message in error, please immediately notify the sender and/or
> notificati...@carrieriq.com and delete or destroy any copy of this message
> and its attachments.

The opinions expressed here are mine, while they may reflect a cognitive thought, that is purely accidental.
Use at your own risk. Michael Segel michael_segel (AT) hotmail.com