Well, it also goes back to the question of how the RO (RegionObserver) is writing to the second table.
If the M/R job used Mapper.setup() to instantiate the HTable for the index write and then wrote to the index table in Mapper.map(), why would the co-processor take much more time than that? I think a code review is in order.

On Feb 18, 2013, at 6:22 AM, yonghu <yongyong...@gmail.com> wrote:

> Ok. Now, I got your point. I didn't notice the "checkAndPut".
>
> regards!
>
> Yong
>
> On Mon, Feb 18, 2013 at 1:11 PM, Michael Segel
> <michael_se...@hotmail.com> wrote:
>>
>> The issue I was talking about was the use of a check and put.
>> The OP wrote:
>>>>>> each map inserts to doc table.(checkAndPut)
>>>>>> regionobserver coprocessor does a postCheckAndPut and inserts some rows to
>>>>>> a index table.
>>
>> My question is why does the OP use a checkAndPut, and the RegionObserver's
>> postCheckAndPut?
>>
>> Here's a good example...
>> http://stackoverflow.com/questions/13404447/is-hbase-checkandput-latency-higher-than-simple-put
>>
>> The OP doesn't really get into the use case, so we don't know why the Check
>> and Put in the M/R job.
>> He should just be using put() and then a postPut().
>>
>> Another issue... since he's writing to a different HTable... how? Does he
>> create an HTable instance in the start() method of his RO object and then
>> reference it later? Or does he create the instance of the HTable on the fly
>> in each postCheckAndPut()?
>> Without seeing his code, we don't know.
>>
>> Note that this is a synchronous set of writes. Your overall return from the
>> M/R call to put will wait until the second row is inserted.
>>
>> Interestingly enough, you may want to consider disabling the WAL on the
>> write to the index. You can always run a M/R job that rebuilds the index
>> should something occur to the system where you might lose the data. Indexes
>> *ARE* expendable. ;-)
>>
>> Does that explain it?
>>
>> -Mike
>>
>> On Feb 18, 2013, at 4:57 AM, yonghu <yongyong...@gmail.com> wrote:
>>
>>> Hi, Michael
>>>
>>> I don't quite understand what you mean by "round trip back to the
>>> client". In my understanding, as the RegionServer and TaskTracker can
>>> be the same node, MR doesn't have to pull data into the client and then
>>> process it. You also mention the "unnecessary overhead"; can you
>>> explain a little bit what operations or data processing can be seen as
>>> "unnecessary overhead"?
>>>
>>> Thanks
>>>
>>> yong
>>>
>>> On Mon, Feb 18, 2013 at 10:35 AM, Michael Segel
>>> <michael_se...@hotmail.com> wrote:
>>>> Why?
>>>>
>>>> This seems like an unnecessary overhead.
>>>>
>>>> You are writing code within the coprocessor on the server. Pessimistic
>>>> code really isn't recommended if you are worried about performance.
>>>>
>>>> I have to ask... by the time you have executed the code in your
>>>> co-processor, what would cause the initial write to fail?
>>>>
>>>>
>>>> On Feb 18, 2013, at 3:01 AM, Prakash Kadel <prakash.ka...@gmail.com> wrote:
>>>>
>>>>> It's a local read. I just check the last param of postCheckAndPut,
>>>>> which indicates whether the Put succeeded. If the put succeeded, I insert a
>>>>> row in another table.
>>>>>
>>>>> Sincerely,
>>>>> Prakash Kadel
>>>>>
>>>>> On Feb 18, 2013, at 2:52 PM, Wei Tan <w...@us.ibm.com> wrote:
>>>>>
>>>>>> Is your CheckAndPut involving a local or remote READ? Due to the nature of
>>>>>> LSM, read is much slower compared to a write...
>>>>>>
>>>>>>
>>>>>> Best Regards,
>>>>>> Wei
>>>>>>
>>>>>>
>>>>>> From: Prakash Kadel <prakash.ka...@gmail.com>
>>>>>> To: "user@hbase.apache.org" <user@hbase.apache.org>,
>>>>>> Date: 02/17/2013 07:49 PM
>>>>>> Subject: coprocessor enabled put very slow, help please~~~
>>>>>>
>>>>>>
>>>>>> hi,
>>>>>> I am trying to insert a few million documents into hbase with mapreduce. To
>>>>>> enable quick search of docs I want to have some indexes, so I tried to use
>>>>>> the coprocessors, but they are slowing down my inserts. Aren't the
>>>>>> coprocessors supposed to not increase the latency?
>>>>>> my settings:
>>>>>> 3 region servers
>>>>>> 60 maps
>>>>>> each map inserts to doc table.(checkAndPut)
>>>>>> regionobserver coprocessor does a postCheckAndPut and inserts some rows to
>>>>>> a index table.
>>>>>>
>>>>>>
>>>>>> Sincerely,
>>>>>> Prakash
>>>>>>
>>>>>
>>>>
>>>> Michael Segel | (m) 312.755.9623
>>>>
>>>> Segel and Associates
>>>>
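
For anyone following the thread, here is a rough sketch of the question about where the index-table handle gets created: once in start()/setup(), or on the fly in every post-hook call. It does not use the HBase API at all. FakeIndexTable, ReusingObserver, PerCallObserver and the two helper methods are hypothetical names made up for illustration, and the instance counter only stands in for whatever it costs to open a real table handle. We still have not seen the OP's code, so this only shows the two patterns being asked about, not what his coprocessor actually does.

```java
import java.util.ArrayList;
import java.util.List;

public class IndexWriteSketch {

    /** Hypothetical stand-in for an index-table handle; this is NOT HBase's HTable. */
    static final class FakeIndexTable {
        static int instancesCreated = 0;
        final List<String> rows = new ArrayList<>();

        FakeIndexTable() {
            // Stands in for the cost of opening a table handle.
            instancesCreated++;
        }

        void put(String row) {
            rows.add(row);
        }
    }

    /** Pattern A: create the handle once (as in start() or setup()) and reuse it. */
    static final class ReusingObserver {
        private FakeIndexTable index;

        void start() {
            index = new FakeIndexTable();
        }

        void postPut(String docRow) {
            index.put("idx:" + docRow);
        }
    }

    /** Pattern B: create a fresh handle inside every post-hook call. */
    static final class PerCallObserver {
        void postPut(String docRow) {
            FakeIndexTable index = new FakeIndexTable();
            index.put("idx:" + docRow);
        }
    }

    /** Number of handles created when one handle is reused for all writes. */
    static int handlesForReuse(int writes) {
        FakeIndexTable.instancesCreated = 0;
        ReusingObserver observer = new ReusingObserver();
        observer.start();
        for (int i = 0; i < writes; i++) {
            observer.postPut("doc" + i);
        }
        return FakeIndexTable.instancesCreated;
    }

    /** Number of handles created when each write opens its own handle. */
    static int handlesForPerCall(int writes) {
        FakeIndexTable.instancesCreated = 0;
        PerCallObserver observer = new PerCallObserver();
        for (int i = 0; i < writes; i++) {
            observer.postPut("doc" + i);
        }
        return FakeIndexTable.instancesCreated;
    }

    public static void main(String[] args) {
        System.out.println("reuse:    " + handlesForReuse(1000) + " handle(s)");
        System.out.println("per call: " + handlesForPerCall(1000) + " handle(s)");
    }
}
```

If the real coprocessor follows the second pattern, every document write pays the handle setup cost on top of the synchronous index write, which is one thing a code review could confirm or rule out.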