Re: Re: How to know the root reason to cause RegionServer OOM?

2015-05-13 Thread David chen
Thanks for your reply. Yes, it indeed appeared in the RegionServer command, as follows: jps -v|grep "Region" HRegionServer -Dproc_regionserver -XX:OnOutOfMemoryError=kill -9 %p -Xmx1000m -Djava.net.preferIPv4Stack=true -Xms16106127360 -Xmx16106127360 -XX:+UseG1GC -XX:MaxGCPauseMillis=6000 -XX:OnOu

client Table instance, confused with autoFlush

2015-05-13 Thread Serega Sheypak
Hi, in 0.94 we could use the autoFlush method on HTable. Now HTable shouldn't be used; we are refactoring code to use Table. Here is a note: http://hbase.apache.org/book.html#perf.hbase.client.autoflush >When performing a lot of Puts, make sure that setAutoFlush is set to false on your Table

Re: client Table instance, confused with autoFlush

2015-05-13 Thread Ted Yu
Please take a look at https://issues.apache.org/jira/browse/HBASE-12728 Cheers On Wed, May 13, 2015 at 6:25 AM, Serega Sheypak wrote: > Hi, in 0.94 we could use autoFlush method for HTable. > Now HTable shouldn't be used, we refactoring code for Table > > Here is a note: > http://hbase.apache.o

Re: client Table instance, confused with autoFlush

2015-05-13 Thread Solomon Duskis
BufferedMutator is the preferred alternative for autoflush starting in HBase 1.0. Get a connection via ConnectionFactory, then connection.getBufferedMutator(tableName). It's the same functionality as autoflush under the covers. On Wed, May 13, 2015 at 9:41 AM, Ted Yu wrote: > Please take a loo
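The pattern Solomon describes can be sketched as follows. This is a minimal illustration only: it assumes an HBase 1.0+ client on the classpath and a reachable cluster, and the table, family, and qualifier names are placeholders.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.BufferedMutator;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedWrites {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             BufferedMutator mutator =
                 connection.getBufferedMutator(TableName.valueOf("my_table"))) {
            for (int i = 0; i < 1000; i++) {
                Put put = new Put(Bytes.toBytes("row-" + i));
                put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(i));
                mutator.mutate(put); // buffered client-side, like autoFlush=false
            }
            mutator.flush(); // explicit flush, like the old flushCommits()
        }
    }
}
```

Writes are batched in the client buffer and sent in bulk, which is the same behavior setAutoFlush(false) gave on HTable.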

Re: client Table instance, confused with autoFlush

2015-05-13 Thread Serega Sheypak
We are using CDH 5.4; it's on the 0.98 version. 2015-05-13 16:49 GMT+03:00 Solomon Duskis : > BufferedMutator is the preferred alternative for autoflush starting in > HBase 1.0. Get a connection via ConnectionFactory, then > connection.getBufferedMutator(tableName). It's the same functionality as >

Re: client Table instance, confused with autoFlush

2015-05-13 Thread Solomon Duskis
The docs you referenced are for 1.0. Table and BufferedMutator were introduced in 1.0. In 0.98, you should continue using HTable and autoflush. On Wed, May 13, 2015 at 9:57 AM, Serega Sheypak wrote: > We are using CDH 5.4; it's on the 0.98 version > > 2015-05-13 16:49 GMT+03:00 Solomon Duskis : >

Re: client Table instance, confused with autoFlush

2015-05-13 Thread Serega Sheypak
But HTable is deprecated in 0.98 ...? 2015-05-13 17:35 GMT+03:00 Solomon Duskis : > The docs you referenced are for 1.0. Table and BufferedMutator were > introduced in 1.0. In 0.98, you should continue using HTable and > autoflush. > > On Wed, May 13, 2015 at 9:57 AM, Serega Sheypak > wrote: >

Re: client Table instance, confused with autoFlush

2015-05-13 Thread Shahab Yunus
Until you move to HBase 1.*, you should use HTableInterface. And the autoFlush methods and semantics are, as far as I understand, the same, so you should not have a problem. Regards, Shahab On Wed, May 13, 2015 at 11:09 AM, Serega Sheypak wrote: > But HTable is deprecated in 0.98 ...? > > 2015-05-13 1
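For the 0.98 code path discussed above, a hedged sketch of the HTableInterface/autoFlush idiom; it assumes an 0.98 client on the classpath and a reachable cluster, and the table and column names are placeholders.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class AutoFlushExample {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HConnection connection = HConnectionManager.createConnection(conf);
        HTableInterface table = connection.getTable("my_table");
        try {
            table.setAutoFlush(false);   // buffer Puts client-side
            for (int i = 0; i < 1000; i++) {
                Put put = new Put(Bytes.toBytes("row-" + i));
                put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(i));
                table.put(put);
            }
            table.flushCommits();        // push the buffered Puts to the cluster
        } finally {
            table.close();
            connection.close();
        }
    }
}
```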

Re: client Table instance, confused with autoFlush

2015-05-13 Thread Serega Sheypak
Ok, thanks! 2015-05-13 18:14 GMT+03:00 Shahab Yunus : > Until you move to HBase 1.*, you should use HTableInterface. And the > autoFlush methods and semantics, as far as I understand are, same so you > should not have problem. > > Regards, > Shahab > > On Wed, May 13, 2015 at 11:09 AM, Serega She

Re: Problems with Phoenix and HBase

2015-05-13 Thread Ted Yu
Putting dev@ to bcc. The question w.r.t. sqlline should be posted to the Phoenix user mailing list. w.r.t. the question on hbase shell, can you give us more information? the release of hbase you use; was there an exception from hbase shell when no tables were returned? a master log snippet from when the above happened On

Re: Re: How to know the root reason to cause RegionServer OOM?

2015-05-13 Thread Elliott Clark
On Wed, May 13, 2015 at 12:59 AM, David chen wrote: > -XX:MaxGCPauseMillis=6000 With this line you're basically telling java to never garbage collect. Can you try lowering that to something closer to the jvm default and see if you have better stability?
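As an illustration of this suggestion, a hypothetical hbase-env.sh fragment that keeps the heap from the jps output above but moves the pause target near the G1 default (200 ms) instead of 6000 ms; the variable name and values are examples only.

```shell
# Hypothetical hbase-env.sh fragment: same 15 GB heap as in the jps output,
# but with a realistic G1 pause goal so the collector actually runs.
export HBASE_REGIONSERVER_OPTS="-Xms16106127360 -Xmx16106127360 -XX:+UseG1GC -XX:MaxGCPauseMillis=200"
```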

MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread rahul malviya
Hi, I have recently started running MR on hbase snapshots, but when the MR is running there is pretty high CPU usage on the datanodes, and I start seeing IO wait messages in the datanode logs; as soon as I kill the MR on the snapshot, everything comes back to normal. What could be causing this? I am running cdh

Re: readAtOffset error when reading from HFiles

2015-05-13 Thread donmai
Ahhh, thanks for that. Yep, all flushes will (or should be) going to S3. I'm working through it and it seems that it's defaulting to the positional read instead of seek+read - is this accurate? On Tue, May 12, 2015 at 12:00 PM, Ted Yu wrote: > -w is shorthand for --seekToRow (so it is not offset

Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread Ted Yu
Have you enabled short circuit read ? Cheers On Wed, May 13, 2015 at 9:37 AM, rahul malviya wrote: > Hi, > > I have recently started running MR on hbase snapshots but when the MR is > running there is pretty high CPU usage on datanodes and I start seeing IO > wait message in datanode logs and a

Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread rahul malviya
Yes. On Wed, May 13, 2015 at 9:40 AM, Ted Yu wrote: > Have you enabled short circuit read ? > > Cheers > > On Wed, May 13, 2015 at 9:37 AM, rahul malviya > > wrote: > > > Hi, > > > > I have recently started running MR on hbase snapshots but when the MR is > > running there is pretty high CPU us

Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread rahul malviya
Short circuit read can cause it to spike CPU and IO wait issues ? Rahul On Wed, May 13, 2015 at 9:41 AM, rahul malviya wrote: > Yes. > > On Wed, May 13, 2015 at 9:40 AM, Ted Yu wrote: > >> Have you enabled short circuit read ? >> >> Cheers >> >> On Wed, May 13, 2015 at 9:37 AM, rahul malviya <

Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread Michael Segel
Without knowing your exact configuration… The high CPU may be WAIT IOs, which would mean that your CPU is waiting for reads from the local disks. What’s the ratio of cores (physical) to disks? What type of disks are you using? That’s going to be the most likely culprit. > On May 13, 201

Re: Re: How to know the root reason to cause RegionServer OOM?

2015-05-13 Thread Ted Yu
For #2, partial row would be returned. Please take a look at the following method in RSRpcServices around line 2393 : public ScanResponse scan(final RpcController controller, final ScanRequest request) Cheers On Wed, May 13, 2015 at 12:59 AM, David chen wrote: > Thanks for you reply. > Yes,

Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread rahul malviya
*The High CPU may be WAIT IOs, which would mean that your CPU is waiting for reads from the local disks.* Yes, I think that's what is going on, but I am trying to understand why it happens only in the case of snapshot MR; if I run the same job without using the snapshot, everything is normal. What is th

Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread Anil Gupta
How many mappers/reducers are running per node for this job? Also, how many mappers are running as data-local mappers? Is your load/data equally distributed? Your disk/CPU ratio looks ok. Sent from my iPhone > On May 13, 2015, at 10:12 AM, rahul malviya > wrote: > > *The High CPU may be WAIT IOs,

Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread rahul malviya
*How many mapper/reducers are running per node for this job?* I am running 7-8 mappers per node. The spike is seen in the mapper phase, so no reducers were running at that point in time. *Also how many mappers are running as data local mappers?* How to determine this? * You load/data equally distri

Re: How to know the root reason to cause RegionServer OOM?

2015-05-13 Thread Stack
On Tue, May 12, 2015 at 7:41 PM, David chen wrote: > A RegionServer was killed because of OutOfMemory (OOM); although the killed > process can be seen in the Linux message log, I still have the two following > problems: > 1. How to inspect the root cause of the OOM? > Start the regionserver with
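The reply above is cut off by the archive. As an assumption (not a quote of the truncated message), a standard set of JVM options for finding an OOM's root cause is to have the JVM write a heap dump at the moment of the OOM, for post-mortem analysis in a tool such as Eclipse MAT; the dump path here is a placeholder.

```shell
# Hypothetical hbase-env.sh fragment: dump the heap on OOM so the offending
# allocation (big row, scanner buffer, etc.) can be inspected afterwards.
export HBASE_REGIONSERVER_OPTS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hbase"
```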

Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread Michael Segel
So … First, you’re wasting money on 10K drives. But that could be your company’s standard. Yes, you’re going to see red. 24 / 12, so is that 12 physical cores or 24 physical cores? I suspect those are dual-chipped with 6 physical cores per chip. That’s 12 cores to 12 disks, which is ok.

Re: How to know the root reason to cause RegionServer OOM?

2015-05-13 Thread Bryan Beaudreault
After moving to the G1GC we were plagued with random OOMs from time to time. We always thought it was due to people requesting a big row or group of rows, but upon investigation noticed that the heap dumps were many GBs less than the max heap at time of OOM. If you have this symptom, you may be r

Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread Esteban Gutierrez
rahul, You might want to look into your MR counters too; if your tasks are spilling too much to disk or the shuffle phase is too large, that might cause lots of contention. Also, you might want to look into the OS/drive settings (write cache off or irqbalance off); as Michael said, CPU might n

HBase MapReduce in Kerberized cluster

2015-05-13 Thread Edward C. Skoviak
I'm attempting to write a Crunch pipeline to read various rows from a table in HBase and then do processing on these results. I am doing this from a cluster deployed using CDH 5.3.2 running Kerberos and YARN. I was hoping to get an answer on what is considered the best approach to authenticate to

Re: HBase MapReduce in Kerberized cluster

2015-05-13 Thread Ted Yu
bq. it has been moved to be a part of the hbase-server package I searched (current) 0.98 and branch-1 where I found: ./hbase-client/src/main/java/org/apache/hadoop/hbase/security/token/TokenUtil.java FYI On Wed, May 13, 2015 at 11:45 AM, Edward C. Skoviak < edward.skov...@gmail.com> wrote: > I'
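For the MapReduce case, a minimal sketch of the usual approach: TableMapReduceUtil.initCredentials obtains an HBase delegation token (delegating to TokenUtil under the covers) and stores it in the job credentials, so YARN tasks can authenticate to HBase without a Kerberos ticket of their own. The job name is a placeholder, and a live Kerberized cluster is assumed.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

public class SecureJobSetup {
    public static Job createJob() throws IOException {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "hbase-read-job");
        // Acquire the HBase delegation token while we still hold a Kerberos
        // TGT on the submitting host, and attach it to the job credentials.
        TableMapReduceUtil.initCredentials(job);
        return job;
    }
}
```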

Hive/Hbase Integration issue

2015-05-13 Thread Ibrar Ahmed
Hi, I am creating a table using hive and getting this error. [127.0.0.1:1] hive> CREATE TABLE hbase_table_1(key int, value string) > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' > WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key

Re: MR against snapshot causes High CPU usage on Datanodes

2015-05-13 Thread anil gupta
Inline. On Wed, May 13, 2015 at 10:31 AM, rahul malviya wrote: > *How many mapper/reducers are running per node for this job?* > I am running 7-8 mappers per node. The spike is seen in mapper phase so no > reducers where running at that point of time. > > *Also how many mappers are running as da

Re: Hive/Hbase Integration issue

2015-05-13 Thread Talat Uyarer
This issue looks like it's due to some missing settings. What did you do for your Hive/HBase integration? Can you give some information about your cluster? BTW, in [1] someone had the same issue; maybe it will help you. [1] http://mail-archives.apache.org/mod_mbox/hive-user/201307.mbox/%3cce01cda1.9221%25sanjay.subraman...@w

Re: Hive/Hbase Integration issue

2015-05-13 Thread Ibrar Ahmed
Here is my hbase-site.xml: hbase.rootdir = file:///usr/local/hbase, hbase.zookeeper.property.dataDir = /usr/local/hbase/zookeeperdata. And hive-site.xml: hive.aux.jars.path = file:///usr/local/hive/lib/zookeeper-3.4.5.jar,file:/usr/local/hive/lib/hive-hbase-handler-0

Re: Hive/Hbase Integration issue

2015-05-13 Thread Talat Uyarer
Your ZooKeeper is managed by HBase. Could you check your hbase.zookeeper.quorum setting? It should be the same as the HBase ZooKeeper. Talat 2015-05-13 23:03 GMT+03:00 Ibrar Ahmed : > Here is my hbase-site.xml > > > > hbase.rootdir > file:///usr/local/hbase > > > hbase.zookeeper.pro
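A sketch of the setting being referred to, as it would appear in the site XML read by the Hive client; the host and port values here are placeholders and must match the ensemble HBase actually uses.

```xml
<!-- Sketch: point the client at the same ZooKeeper ensemble that serves
     HBase; "localhost" and "2181" are placeholder values. -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>localhost</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
```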

Re: RowKey hashing in HBase 1.0

2015-05-13 Thread jeremy p
Thank you for your response. However, I'm still having a hard time understanding you. Apologies for this. So, this is where I think I'm getting confused : Let's talk about the original rowkey, before anything has been prepended to it. Let's call this original_rowkey. Let's say your first orig

Troubles with HBase 1.1.0 RC2

2015-05-13 Thread James Estes
I saw the vote thread for RC2, so tried to build my project against it. My build fails when I depend on 1.1.0. I created a bare bones project to show the issue I'm running into: https://github.com/housejester/hbase-deps-test To be clear, it works in 1.0.0 (and I did add the repository). Further,

Re: Troubles with HBase 1.1.0 RC2

2015-05-13 Thread Andrew Purtell
> So, it looks like RegionCoprocessorEnvironment.getRegion() has been removed? No, the signature has changed, basically s/HRegion/Region/. HRegion is an internal, low level implementation type. Has always been. We have replaced it with Region, an interface that contains a subset of HRegion we feel
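A hedged sketch of the migration being described, assuming a coprocessor recompiled against 1.1.0 client/server jars; the observer class and log line are illustrative only.

```java
import java.io.IOException;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.Region;

public class MyObserver extends BaseRegionObserver {
    @Override
    public void preOpen(ObserverContext<RegionCoprocessorEnvironment> c)
            throws IOException {
        // Before 1.1.0 getRegion() returned HRegion (an internal class);
        // now it returns the Region interface, so code only needs to
        // change the declared type, i.e. s/HRegion/Region/.
        Region region = c.getEnvironment().getRegion();
        System.out.println("opening " + region.getRegionInfo().getRegionNameAsString());
    }
}
```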

Re: New post on hbase-1.1.0 throttling feature up on our Apache blog

2015-05-13 Thread Nick Dimiduk
This is a great demonstration of these new features, thanks for pointing it out Stack. I'm curious: at what percentile are these latencies reported? Does the non-throttled user see significant latency improvements at the 95th and 99th percentiles when the competing, scanning users are throttled? MB/s and req/s are ma

Re: Questions related to HBase general use

2015-05-13 Thread Nick Dimiduk
+ Swarnim, who's the expert on HBase/Hive integration. Yes, snapshots may be interesting for you. I believe Hive can access HBase timestamps, exposed as a "virtual" column. It's assumed across the whole row, however, not per cell. On Sun, May 10, 2015 at 9:14 PM, Jerry He wrote: > Hi, Yong > > You

Re: New post on hbase-1.1.0 throttling feature up on our Apache blog

2015-05-13 Thread Govind Kamat
> This is a great demonstration of these new features, thanks for pointing it > out Stack. > > I'm curious: at what percentile are these latencies reported? Does the > non-throttled user see significant latency improvements at the 95th and 99th percentiles > when the competing, scanning users are throttled? MB/

Re: New post on hbase-1.1.0 throttling feature up on our Apache blog

2015-05-13 Thread Stack
Should we add your comments to the blog, Govind, i.e. the answers to Nick's questions? St.Ack On Wed, May 13, 2015 at 5:48 PM, Govind Kamat wrote: > > This is a great demonstration of these new features, thanks for > pointing it > > out Stack. > > > > I'm curious: what percentile latencies

Re: New post on hbase-1.1.0 throttling feature up on our Apache blog

2015-05-13 Thread Nick Dimiduk
Sorry. Yeah, sure, I can ask over there. The throttle was set by user in these tests. You cannot directly > throttle a specific job, but do have the option to set the throttle > for a table or a namespace. That might be sufficient for you to > achieve your objective (unless those jobs are run by

Re: New post on hbase-1.1.0 throttling feature up on our Apache blog

2015-05-13 Thread Matteo Bertozzi
@nick what would you like to have? a match on a Job ID or something like that? currently only user/table/namespace are supported, but group support can be easily added. not sure about a job-id or job-name since we don't have that info on the scan. On Wed, May 13, 2015 at 6:04 PM, Nick Dimiduk wro
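The user/table/namespace throttles mentioned here are set from the hbase shell. A sketch of the 1.1.0 set_quota syntax, with example principals and limits (the user, table, and namespace names are placeholders):

```
hbase> set_quota TYPE => THROTTLE, USER => 'mr_user', LIMIT => '10M/sec'
hbase> set_quota TYPE => THROTTLE, TABLE => 't1', LIMIT => '1000req/sec'
hbase> set_quota TYPE => THROTTLE, NAMESPACE => 'ns1', LIMIT => NONE
```

LIMIT => NONE removes a previously set throttle.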

Re: New post on hbase-1.1.0 throttling feature up on our Apache blog

2015-05-13 Thread Nick Dimiduk
I guess what I'm thinking of is more about scheduling than quota/throttling. I don't want my online requests to sit in a queue behind MR requests while the MR work builds up to its quota amount. I want a scheduler to do time-slicing of operations, with preferential treatment given to online work ov

Re: New post on hbase-1.1.0 throttling feature up on our Apache blog

2015-05-13 Thread Matteo Bertozzi
@nick we have already something like that, which is HBASE-10993 and it is basically reordering requests based on how many scan.next you did. (see the picture) http://blog.cloudera.com/wp-content/uploads/2014/11/hbase-multi-f2.png the problem is that we can't eject requests in execution and we are n

Re: Troubles with HBase 1.1.0 RC2

2015-05-13 Thread Enis Söztutar
Yeah, for coprocessors, what Andrew said. You have to make minor changes. >From your repo, I was able to build: HW10676:hbase-deps-test$ ./build.sh :compileJava Download https://repository.apache.org/content/repositories/orgapachehbase-1078/org/apache/hbase/hbase/1.1.0/hbase-1.1.0.pom Download

Re: HBase Block locality always 0

2015-05-13 Thread 娄帅
Any idea? 2015-05-12 17:59 GMT+08:00 娄帅 : > Hi, all, > > I am maintaining an hbase 0.96.0 cluster, but from the web ui of the HBase > regionserver, > I saw Block Locality is 0 for all regionservers. > > Datanodes on l-hbase[26-31].data.cn8 and regionservers on > l-hbase[25-31].data.cn8, > > Any idea? >

Re: HBase Block locality always 0

2015-05-13 Thread Dima Spivak
Have you seen Esteban's suggestion? Another possibility is that a number of old JIRAs covered the fact that regions were assigned in a silly way when a table was disabled and then enabled. Could this be the case for you? -Dima On Wed, May 13, 2015 at 8:36 PM, 娄帅 wrote: > anyidea? > > 2015-05-12

Re: New post on hbase-1.1.0 throttling feature up on our Apache blog

2015-05-13 Thread ramkrishna vasudevan
I think related to this would be HBASE-12790, where we would do round-robin scheduling and thus help shorter scans get a time slice in the execution cycle. On Thu, May 14, 2015 at 7:12 AM, Matteo Bertozzi wrote: > @nick we have already something like that, which is HBASE-109

Re: Questions related to HBase general use

2015-05-13 Thread Krishna Kalyan
I know that BigInsights comes with BigSQL, which interacts with HBase as well; have you considered that option? We have a similar use case using BigInsights 2.1.2. On Thu, May 14, 2015 at 4:56 AM, Nick Dimiduk wrote: > + Swarnim, who's expert on HBase/Hive integration. > > Yes, snapshots may be

Problem in scanning hbase in reducer phase of mapreduce

2015-05-13 Thread Priya
I am using hadoop 1.2.1 and hbase 0.94.12. I have a scenario where, in the reducer phase, I have to read data, check if the key of the map is already inserted, and then put it into the hbase table. When I tried on a single node, all gets and scans worked. But when I tried with a 3-node cluster, scanning does not work. C