DataWorks Summit/Hadoop Summit - Call for abstracts

2017-02-07 Thread Devaraj Das
Just a quick note on the DataWorks Summit / Hadoop Summit. I should have sent this earlier, but anyway: the DataWorks Summit/Hadoop Summit Organizing Committee invites you to submit an abstract to be considered for the summit in San Jose on June 13-15. We

Re: Region server dies at regular intervals for unknown reasons.

2017-02-07 Thread Ted Yu
You can search backward in the master log for the region. The log will tell you which region server last tried to open it. Pastebin the relevant snippet of the region server log pertaining to the attempted open of the region. Thanks. On Tue, Feb 7, 2017 at 7:17 PM, Kang Minwoo wrote: > Yes. I agree wi
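The backward search suggested above could be sketched roughly as follows. This is only an illustration: the helper name `last_mentions` is hypothetical, the match is a plain substring test on the region name, and no particular HBase master log format is assumed.

```python
from typing import Iterable, List


def last_mentions(log_lines: Iterable[str], region_name: str, limit: int = 5) -> List[str]:
    """Return up to `limit` log lines that mention `region_name`, newest first.

    Hypothetical helper: it assumes only that the region name appears
    verbatim somewhere in the relevant master log lines.
    """
    hits: List[str] = []
    # Walk the log from the last line to the first so the most recent
    # mention of the region is found first.
    for line in reversed(list(log_lines)):
        if region_name in line:
            hits.append(line.rstrip("\n"))
            if len(hits) == limit:
                break
    return hits
```

With a real log, the lines could be read with `open(path).readlines()` and the returned lines inspected by hand to see which region server was last asked to open the region.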

RE: Region server dies at regular intervals for unknown reasons.

2017-02-07 Thread Kang Minwoo
Yes. I agree with you. But I cannot upgrade right away. The problem is that region servers that have received a particular region continue to die. I got the name of that region. What should I do to find out whether a region server dies when it receives that region? Thanks. _

Re: Hbase REST returns invalid status code

2017-02-07 Thread Ted Yu
Which release of HBase are you using? I was searching using the snippet from AsyncProcess and found HBASE-14431. FYI On Tue, Feb 7, 2017 at 6:29 PM, Akshat Mahajan wrote: > We are seeing exceptions where Hbase REST is returning an invalid status > code on writes, causing our Python processes

Hbase REST returns invalid status code

2017-02-07 Thread Akshat Mahajan
We are seeing exceptions where HBase REST is returning an invalid status code on writes, causing our Python processes to error out when attempting a write (the following error is thrown by the Python starbase library, but is not caused by starbase directly): `requests.exceptions.ConnectionErr

Re: Seeking advice on skipped/lost data during data migration from and to a hbase table

2017-02-07 Thread Sean Busbey
Unfortunately, as far as I could tell there are no logs or metrics on either side to indicate that heartbeats are happening. So I'd say the only option is to verify that items are missing, then apply the fix from HBASE-15378 and see if the problem clears up. On Tue, Feb 7, 2017 at 12:34 PM, Alexa

Re: Seeking advice on skipped/lost data during data migration from and to a hbase table

2017-02-07 Thread Alexandre Normand
I agree we should upgrade to get this fix and I moved this to a Cloudera support case to get a patched release. I'm hoping that maybe we can get more confidence that this is it but I'm moving this to Cloudera support. Thanks for the help! On Tue, Feb 7, 2017 at 12:45 PM Ted Yu wrote: > There is

Re: Seeking advice on skipped/lost data during data migration from and to a hbase table

2017-02-07 Thread Ted Yu
There is not much in the log which would indicate the trigger of this bug. From the information presented on this thread, it is highly likely that once you deploy a build with the fix from HBASE-15378, you would get consistent results from your map tasks. I suggest you arrange an upgrade of your clust

Re: Seeking advice on skipped/lost data during data migration from and to a hbase table

2017-02-07 Thread Alexandre Normand
Thanks for the correction, Sean. I'm thinking of trying to reproduce the problem on a non-production cluster using the same migration job that I was talking about in my original post (we have data similar to production on a non-prod cluster) but then, I'm not sure how to validate that what we're e

Re: Seeking advice on skipped/lost data during data migration from and to a hbase table

2017-02-07 Thread Sean Busbey
HBASE-15378 says that it was caused by HBASE-13090, I think. That issue is present in CDH5.5.4: http://archive.cloudera.com/cdh5/cdh/5/hbase-1.0.0-cdh5.5.4.releasenotes.html (Search in page for HBASE-13090) On Tue, Feb 7, 2017 at 11:51 AM, Alexandre Normand wrote: > Reporting back with some re

Re: Seeking advice on skipped/lost data during data migration from and to a hbase table

2017-02-07 Thread Alexandre Normand
Reporting back with some results. We ran several RowCounters and each one gives us the same count back. It could be because RowCounter is much more lightweight than our migration job (which reads every cell and turns back to write an equivalent version in another table) but it's hard to tell. Tak