Anil:
bq. We also use CP's wherever they are appropriate (like HBASE-7474).

HBASE-7474 has been dormant for several months. Do you want to revive it?

Cheers

On Mon, Oct 14, 2013 at 3:25 PM, anil gupta <[email protected]> wrote:

> Inline.
>
> On Mon, Oct 14, 2013 at 7:50 AM, Michael Segel <[email protected]> wrote:
>
> > Anil,
> >
> > I wasn't suggesting that you can't do what you're doing, but you end up
> > running into the risks which coprocessors are supposed to remove. The
> > standard YMMV always applies.
>
> Agree with you. But, as per my knowledge and experience with coprocessors,
> they are meant to be used for operations that are local to RS. Otherwise,
> you are in danger of running into deadlocks, scalability issues.
>
> > You have a cluster… another team in your company wants to use the cluster.
> > So instead of the cluster being a single resource for your app/team, it now
> > becomes a shared resource. So now you have people accessing HBase for
> > multiple apps.
>
> Well, it's a separation of responsibility in this case. We don't want teams
> to step on each other's toes, and at the same time we want them to work well
> as an ecosystem.
> Rule: Other teams can use the same cluster. But they cannot write directly into
> the tables that we own/control. If they want to write into our tables then
> they have to use our HBase Client.
>
> > You could then run multiple HBase HMasters with different locations for
> > files, however… this can get messy.
> > HOYA seems to suggest this as the future. If so, then you have to wonder
> > about data locality.
>
> HOYA is not even in beta at present. So, right now we are not thinking
> about it.
>
> > Having your app update the primary table and then the secondary index is
> > always a good fallback, however you need to ensure that you understand the
> > risks.
>
> Agree, I understand that there is risk. But, you have to bite the bullet
> when you are doing something that is not supported out of the box. We also
> use CP's wherever they are appropriate (like HBASE-7474).
>
> > With respect to secondary indexes… if you decouple the writes… you can get
> > better throughput. Note that the code becomes a bit more complex because
> > you're going to have to introduce a couple of different things. But that's
> > something for a different discussion…
>
> Whether to use CP or not depends on the use case. In my opinion, CP's are
> really powerful and an awesome feature in HBase. But, sometimes if not used
> properly (like creating a Cyclic Graph as per Tom's example), they might be
> problematic.
>
> > On Oct 13, 2013, at 10:15 AM, anil gupta <[email protected]> wrote:
> >
> > > Inline.
> > >
> > > On Sun, Oct 13, 2013 at 6:02 AM, Michael Segel <[email protected]> wrote:
> > >
> > >> Ok…
> > >>
> > >> Sure you can have your app update the secondary index table.
> > >> The only issue with that is if someone updates the base table outside of
> > >> your app, they may or may not increment the secondary index.
> > >
> > > Anil: We don't allow people to write data into HBase from their own HBase
> > > client. We control the writes into HBase. So, we don't have the problem of
> > > the secondary index not getting written.
> > > For example, if you expose a RESTful web service you can easily control the
> > > writes to HBase. Even if a user requests to write one row in the "main table",
> > > your application can have the logic to write to the "Secondary index" tables.
> > > In this way, it is transparent to users also. You can add/remove secondary
> > > indexes as you want.
> > >
> > >> Note that your secondary index doesn't have to be an inverted table, but
> > >> could be SOLR, LUCENE or something else.
> > >
> > > Anil: As of now, we are happy with Inverted tables as they fit our use
> > > case.
> > >
> > >> So you really want to secondary indexes on the server.
> > >>
> > >> There are a couple of things that could improve the performance, although
> > >> the write to the secondary index would most likely lag under heavy load.
> > >>
> > >> On Oct 12, 2013, at 11:27 PM, anil gupta <[email protected]> wrote:
> > >>
> > >>> John,
> > >>>
> > >>> My 2 cents:
> > >>> I tried implementing Secondary Index by using Region Observers on Put. It
> > >>> works well under low load. But, under heavy load the RO could not keep up
> > >>> with the load of cross region server writes.
> > >>> Then, I decided not to use RO as per Andrew's explanation and I moved all
> > >>> the logic of building secondary index tables to my HBase Client. Since
> > >>> then, the system has been running fine under heavy load.
> > >>> IMO, if you use RO and do cross RS read/write then perhaps this will
> > >>> become your bottleneck in HBase.
> > >>> Is it possible for you to avoid RO and control the writes/updates from
> > >>> the client side?
> > >>>
> > >>> Thanks,
> > >>> Anil Gupta
> > >>>
> > >>> On Fri, Oct 11, 2013 at 6:06 PM, John Weatherford <[email protected]> wrote:
> > >>>
> > >>>> OP Here :)
> > >>>>
> > >>>> Our current design involves a Region Observer on a table that does
> > >>>> increments on a second table. We took the approach that Michael said and
> > >>>> inside the RO, we got a new connection and everything. We believe this is
> > >>>> causing deadlocks for us. Our next attempt is going to be writing to
> > >>>> another row in the same table where we will store the increments. If this
> > >>>> doesn't work, we are going to simply pull the increments out of the RO and
> > >>>> do them in the application or in Flume.
> > >>>>
> > >>>> @Tom Brown
> > >>>> I would be very interested to hear more about your solution of
> > >>>> aggregating the increments in another system that is then responsible for
> > >>>> updating in HBase.
> > >>>>
> > >>>> -jW
> > >>>>
> > >>>> On Fri 11 Oct 2013 10:26:58 AM PDT, Vladimir Rodionov wrote:
> > >>>>
> > >>>>>>> With respect to the OP's design… does the deadlock occur because he's
> > >>>>>>> trying to update a column in a different row within the same table?
> > >>>>>
> > >>>>> Because he is trying to update *row* in a different Region (and
> > >>>>> potentially in different RS).
> > >>>>>
> > >>>>> Best regards,
> > >>>>> Vladimir Rodionov
> > >>>>> Principal Platform Engineer
> > >>>>> Carrier IQ, www.carrieriq.com
> > >>>>> e-mail: [email protected]
> > >>>>>
> > >>>>> ________________________________________
> > >>>>> From: Michael Segel [[email protected]]
> > >>>>> Sent: Friday, October 11, 2013 9:10 AM
> > >>>>> To: [email protected]
> > >>>>> Cc: Vladimir Rodionov
> > >>>>> Subject: Re: Coprocessor Increments
> > >>>>>
> > >>>>> Confidentiality Notice: The information contained in this message,
> > >>>>> including any attachments hereto, may be confidential and is intended to be
> > >>>>> read only by the individual or entity to whom this message is addressed. If
> > >>>>> the reader of this message is not the intended recipient or an agent or
> > >>>>> designee of the intended recipient, please note that any review, use,
> > >>>>> disclosure or distribution of this message or its attachments, in any form,
> > >>>>> is strictly prohibited. If you have received this message in error, please
> > >>>>> immediately notify the sender and/or [email protected] and
> > >>>>> delete or destroy any copy of this message and its attachments.
> > >>>
> > >>> --
> > >>> Thanks & Regards,
> > >>> Anil Gupta
> > >
> > > --
> > > Thanks & Regards,
> > > Anil Gupta
>
> --
> Thanks & Regards,
> Anil Gupta
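
As an illustration of the client-side approach discussed in this thread (the application writes the main table and then the inverted index table itself, instead of relying on a Region Observer), here is a minimal Python sketch. It is not HBase code: plain dictionaries stand in for the two tables, and `IndexedWriter`, `put`, and `lookup` are hypothetical names that do not appear in the thread. It also does not model a failure between the two writes, which is the risk the thread says you have to accept with this design.

```python
from collections import defaultdict


class IndexedWriter:
    """Hypothetical client-side writer that keeps an inverted index in step
    with a main table. Dicts stand in for HBase tables (an assumption made
    for illustration; no HBase API is used here)."""

    def __init__(self, indexed_columns):
        self.indexed_columns = set(indexed_columns)
        self.main_table = {}                 # row key -> {column: value}
        self.index_table = defaultdict(set)  # (column, value) -> set of row keys

    def put(self, row_key, columns):
        # Remember the previous values so stale index entries can be removed.
        old = dict(self.main_table.get(row_key, {}))

        # 1) Write the main table first.
        self.main_table[row_key] = {**old, **columns}

        # 2) Then maintain the inverted index for the indexed columns only.
        for col, value in columns.items():
            if col not in self.indexed_columns:
                continue
            if col in old and old[col] != value:
                self.index_table[(col, old[col])].discard(row_key)
            self.index_table[(col, value)].add(row_key)

    def lookup(self, column, value):
        # Return the row keys whose indexed column currently holds this value.
        return sorted(self.index_table.get((column, value), ()))


writer = IndexedWriter(indexed_columns=["city"])
writer.put("row1", {"city": "Austin", "name": "a"})
writer.put("row2", {"city": "Austin"})
writer.put("row1", {"city": "Boston"})
print(writer.lookup("city", "Austin"))
print(writer.lookup("city", "Boston"))
```

Because every write goes through one writer object, adding or removing an index only means changing `indexed_columns`; callers never touch the index table directly, which mirrors the "all writes go through our client" rule described above.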
