Ok. Now, I got your point. I didn't notice the "checkAndPut". regards!
Yong

On Mon, Feb 18, 2013 at 1:11 PM, Michael Segel <michael_se...@hotmail.com> wrote:
>
> The issue I was talking about was the use of a check and put.
> The OP wrote:
>>>>> each map inserts to doc table.(checkAndPut)
>>>>> regionobserver coprocessor does a postCheckAndPut and inserts some rows to
>>>>> a index table.
>
> My question is why does the OP use a checkAndPut, and the RegionObserver's
> postChecAndPut?
>
> Here's a good example...
> http://stackoverflow.com/questions/13404447/is-hbase-checkandput-latency-higher-than-simple-put
>
> The OP doesn't really get in to the use case, so we don't know why the Check
> and Put in the M/R job.
> He should just be using put() and then a postPut().
>
> Another issue... since he's writing to a different HTable... how? Does he
> create an HTable instance in the start() method of his RO object and then
> reference it later? Or does he create the instance of the HTable on the fly
> in each postCheckAndPut() ?
> Without seeing his code, we don't know.
>
> Note that this is synchronous set of writes. Your overall return from the M/R
> call to put will wait until the second row is inserted.
>
> Interestingly enough, you may want to consider disabling the WAL on the write
> to the index. You can always run a M/R job that rebuilds the index should
> something occur to the system where you might lose the data. Indexes *ARE*
> expendable. ;-)
>
> Does that explain it?
>
> -Mike
>
> On Feb 18, 2013, at 4:57 AM, yonghu <yongyong...@gmail.com> wrote:
>
>> Hi, Michael
>>
>> I don't quite understand what do you mean by "round trip back to the
>> client". In my understanding, as the RegionServer and TaskTracker can
>> be the same node, MR don't have to pull data into client and then
>> process. And you also mention the "unnecessary overhead", can you
>> explain a little bit what operations or data processing can be seen as
>> "unnecessary overhead".
>>
>> Thanks
>>
>> yong
>>
>> On Mon, Feb 18, 2013 at 10:35 AM, Michael Segel
>> <michael_se...@hotmail.com> wrote:
>>> Why?
>>>
>>> This seems like an unnecessary overhead.
>>>
>>> You are writing code within the coprocessor on the server. Pessimistic
>>> code really isn't recommended if you are worried about performance.
>>>
>>> I have to ask... by the time you have executed the code in your
>>> co-processor, what would cause the initial write to fail?
>>>
>>> On Feb 18, 2013, at 3:01 AM, Prakash Kadel <prakash.ka...@gmail.com> wrote:
>>>
>>>> its a local read. i just check the last param of PostCheckAndPut
>>>> indicating if the Put succeeded. Incase if the put success, i insert a row
>>>> in another table
>>>>
>>>> Sincerely,
>>>> Prakash Kadel
>>>>
>>>> On Feb 18, 2013, at 2:52 PM, Wei Tan <w...@us.ibm.com> wrote:
>>>>
>>>>> Is your CheckAndPut involving a local or remote READ? Due to the nature of
>>>>> LSM, read is much slower compared to a write...
>>>>>
>>>>> Best Regards,
>>>>> Wei
>>>>>
>>>>> From: Prakash Kadel <prakash.ka...@gmail.com>
>>>>> To: "user@hbase.apache.org" <user@hbase.apache.org>,
>>>>> Date: 02/17/2013 07:49 PM
>>>>> Subject: coprocessor enabled put very slow, help please~~~
>>>>>
>>>>> hi,
>>>>> i am trying to insert few million documents to hbase with mapreduce. To
>>>>> enable quick search of docs i want to have some indexes, so i tried to use
>>>>> the coprocessors, but they are slowing down my inserts. Arent the
>>>>> coprocessors not supposed to increase the latency?
>>>>> my settings:
>>>>> 3 region servers
>>>>> 60 maps
>>>>> each map inserts to doc table.(checkAndPut)
>>>>> regionobserver coprocessor does a postCheckAndPut and inserts some rows to
>>>>> a index table.
>>>>>
>>>>> Sincerely,
>>>>> Prakash
>>>>>
>>>>
>>>
>>> Michael Segel | (m) 312.755.9623
>>>
>>> Segel and Associates
>>>
>>
>
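
For anyone reading this thread in the archive: the pattern Mike describes (a plain put() followed by a postPut() hook, with the index-table handle opened once in start() and reused) can be modeled in plain Java. This is only a rough sketch. SketchTable, IndexingObserver and DocRegion are hypothetical stand-ins, not HBase classes, and the one-index-row-per-term layout is an assumption, since the OP's code and schema were never posted.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a table handle; NOT the HBase HTable class.
class SketchTable {
    final Map<String, String> rows = new TreeMap<>();

    void put(String rowKey, String value) {
        rows.put(rowKey, value);
    }
}

// Hypothetical observer modeled on the start()/postPut() pattern from the thread.
class IndexingObserver {
    private SketchTable indexTable;
    int handlesOpened = 0; // counts how often the index handle is set up

    // Runs once when the observer is loaded: keep the index handle and reuse it,
    // instead of creating a new one inside every hook call.
    void start(SketchTable index) {
        this.indexTable = index;
        handlesOpened++;
    }

    // Runs after each put on the doc table. It executes synchronously,
    // so the caller's put does not return until these index writes finish.
    void postPut(String docRowKey, String docValue) {
        for (String term : docValue.split("\\s+")) {
            // Assumed index layout: one row per (term, document) pair.
            indexTable.put(term + "/" + docRowKey, "");
        }
    }
}

// Hypothetical region that calls the observer hook after every put.
class DocRegion {
    final SketchTable docTable = new SketchTable();
    private final IndexingObserver observer;

    DocRegion(IndexingObserver observer) {
        this.observer = observer;
    }

    // Plain put followed by the post-put hook; no check-and-put read involved.
    void put(String rowKey, String value) {
        docTable.put(rowKey, value);
        observer.postPut(rowKey, value);
    }
}
```

In this model every put on the doc table costs one doc write plus one index write per term, all on the caller's clock, which is the synchronous cost Mike points out. The handle is set up once no matter how many puts follow.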