Hi Ted,

Sure, I would like to revive it. My bad that I didn't wrap up the patch. I am also in the middle of making this coprocessor handle the "nulls first" and "nulls last" clauses, and I am aiming to finish that in a month or so. Thanks for reminding me.
~Anil

On Mon, Oct 14, 2013 at 3:34 PM, Ted Yu <[email protected]> wrote:

> Anil:
> bq. We also use CP's wherever they are appropriate(like HBASE-7474).
>
> HBASE-7474 has been dormant for several months. Do you want to revive it ?
>
> Cheers
>
>
> On Mon, Oct 14, 2013 at 3:25 PM, anil gupta <[email protected]> wrote:
>
> > Inline.
> >
> > On Mon, Oct 14, 2013 at 7:50 AM, Michael Segel <[email protected]> wrote:
> >
> > > Anil,
> > >
> > > I wasn't suggesting that you can't do what you're doing, but you end up
> > > running in to the risks which coprocessors are supposed to remove. The
> > > standard YMMV always applies.
> > >
> > Agree with you. But, as per my knowledge and experience with coprocessors,
> > they are meant to be used for operations that are local to RS. Otherwise,
> > you are in danger of running into deadlocks, scalability issues.
> >
> > >
> > > You have a cluster… another team in your company wants to use the cluster.
> > > So instead of the cluster being a single resource for your app/team, it now
> > > becomes a shared resource. So now you have people accessing HBase for
> > > multiple apps.
> > >
> > Well, its a separation of responsibility in this case. We don't want teams
> > to step each other toes and at the same time work well as an ecosystem.
> > Rule: Other teams can use same cluster. But they cannot write directly into
> > the tables that we own/control. If they want to write into our tables then
> > they have to use our HBase Client.
> >
> > >
> > > You could then run multiple HBase HMasters with different locations for
> > > files, however… this can get messy.
> > > HOYA seems to suggest this as the future. If so, then you have to wonder
> > > about data locality.
> > >
> > HOYA is not even in beta at present. So, right now we are not thinking
> > about it.
> >
> > >
> > > Having your app update the primary table and then the secondary index is
> > > always a good fallback, however you need to ensure that you understand the
> > > risks.
> > >
> > Agree, i understand that there is risk. But, you have to bite the bullet
> > when you are doing something that is not supported out of the box. We also
> > use CP's wherever they are appropriate(like HBASE-7474).
> >
> > >
> > > With respect to secondary indexes… if you decouple the writes… you can get
> > > better throughput. Note that the code becomes a bit more complex because
> > > you're going to have to introduce a couple of different things. But thats
> > > something for a different discussion…
> > >
> > Whether to use CP or not, depends on the use case. In my opinion, CP's are
> > really powerful and an awesome feature in HBase. But, sometimes if not used
> > properly(like creating a Cyclic Graph as per Tom's example), they might be
> > problematic.
> >
> > >
> > >
> > > On Oct 13, 2013, at 10:15 AM, anil gupta <[email protected]> wrote:
> > >
> > > > Inline.
> > > >
> > > > On Sun, Oct 13, 2013 at 6:02 AM, Michael Segel <[email protected]>wrote:
> > > >
> > > >> Ok…
> > > >>
> > > >> Sure you can have your app update the secondary index table.
> > > >> The only issue with that is if someone updates the base table outside of
> > > >> your app,
> > > >> they may or may not increment the secondary index.
> > > >>
> > > > Anil: We dont allow people to write data into HBase from their own HBase
> > > > client. We control the writes into HBase. So, we dont have the problem of
> > > > secondary index not getting written.
> > > > For example, If you expose a restful web service you can easily control the
> > > > writes to HBase. Even, if user requests to write one row in "main table",
> > > > you application can have the logic to writing in "Secondary index" tables.
> > > > In this way, it is transparent to users also. You can add/remove seconday
> > > > indexes as you want.
> > > >
> > > >> Note that your secondary index doesn't have to be an inverted table, but
> > > >> could be SOLR, LUCENE or something else.
> > > >>
> > > > Anil:As of now, we are happy with Inverted tables as they fit to our use
> > > > case.
> > > >
> > > >>
> > > >> So you really want to secondary indexes on the server.
> > > >>
> > > >> There are a couple of things that could improve the performance, although
> > > >> the write to the secondary index would most likely lag under heavy load.
> > > >>
> > > >>
> > > >> On Oct 12, 2013, at 11:27 PM, anil gupta <[email protected]> wrote:
> > > >>
> > > >>> John,
> > > >>>
> > > >>> My 2 cents:
> > > >>> I tried implementing Secondary Index by using Region Observers on Put. It
> > > >>> works well under low load. But, under heavy load the RO could not keep up
> > > >>> with load cross region server writes.
> > > >>> Then, i decided not to use RO as per Andrew's explanation and I moved all
> > > >>> the logic of building secondary index tables on my HBase Client . Since
> > > >>> then, the system has been running fine under heavy load.
> > > >>> IMO, if you will use RO and do cross RS read/write then perhaps this will
> > > >>> become your bottleneck in HBase.
> > > >>> Is it possible for you to avoid RO and control the writes/updates from
> > > >>> client side?
> > > >>>
> > > >>> Thanks,
> > > >>> Anil Gupta
> > > >>>
> > > >>>
> > > >>> On Fri, Oct 11, 2013 at 6:06 PM, John Weatherford <
> > > >>> [email protected]> wrote:
> > > >>>
> > > >>>> OP Here :)
> > > >>>>
> > > >>>> Our current design involves a Region Observer on a table that does
> > > >>>> increments on a second table. We took the approach that Michael said and
> > > >>>> inside the RO, we got a new connection and everything. We believe this is
> > > >>>> causing deadlocks for us. Our next attempt is going to be writing to
> > > >>>> another row in the same table where we will store the increments. If this
> > > >>>> doesn't work, we are going to simply pull the increments out of the RO and
> > > >>>> do them in the application or in Flume.
> > > >>>>
> > > >>>> @Tom Brown
> > > >>>> I would be very interested to hear more about your solution of
> > > >>>> aggregating the increments in another system that is then responsible for
> > > >>>> updating in Hbase.
> > > >>>>
> > > >>>> -jW
> > > >>>>
> > > >>>>
> > > >>>> On Fri 11 Oct 2013 10:26:58 AM PDT, Vladimir Rodionov wrote:
> > > >>>>
> > > >>>>> With respect to the OP's design… does the deadlock occur because he's
> > > >>>>>>> trying to update a column in a different row within the same table?
> > > >>>>>>>
> > > >>>>>>
> > > >>>>> Because he is trying to update *row* in a different Region (and
> > > >>>>> potentially in different RS).
> > > >>>>>
> > > >>>>> Best regards,
> > > >>>>> Vladimir Rodionov
> > > >>>>> Principal Platform Engineer
> > > >>>>> Carrier IQ, www.carrieriq.com
> > > >>>>> e-mail: [email protected]
> > > >>>>>
> > > >>>>> ________________________________________
> > > >>>>> From: Michael Segel [[email protected]]
> > > >>>>> Sent: Friday, October 11, 2013 9:10 AM
> > > >>>>> To: [email protected]
> > > >>>>> Cc: Vladimir Rodionov
> > > >>>>> Subject: Re: Coprocessor Increments
> > > >>>>>
> > > >>>>>
> > > >>>>> Confidentiality Notice: The information contained in this message,
> > > >>>>> including any attachments hereto, may be confidential and is intended to be
> > > >>>>> read only by the individual or entity to whom this message is addressed. If
> > > >>>>> the reader of this message is not the intended recipient or an agent or
> > > >>>>> designee of the intended recipient, please note that any review, use,
> > > >>>>> disclosure or distribution of this message or its attachments, in any form,
> > > >>>>> is strictly prohibited. If you have received this message in error, please
> > > >>>>> immediately notify the sender and/or [email protected]
> > > >>>>> delete or destroy any copy of this message and its attachments.
> > > >>>>>
> > > >>>>
> > > >>>
> > > >>>
> > > >>> --
> > > >>> Thanks & Regards,
> > > >>> Anil Gupta
> > > >>
> > > >>
> > > >
> > > >
> > > > --
> > > > Thanks & Regards,
> > > > Anil Gupta
> > >
> > >
> >
> >
> > --
> > Thanks & Regards,
> > Anil Gupta
> >
>

--
Thanks & Regards,
Anil Gupta
