Re: RegionServer shutdown by some unknown reason.

2016-08-29 Thread Ted Yu
Please use user@ in the future. You said: "zk session timeout is 40s". The default value is 90s. Why did you configure it with a lower value? The "RegionServer ephemeral node deleted" message means that the znode for olap3.data.lq,16020,1470799848293 expired. Can you pastebin the JVM parameters (are you using CM

Re: regionserver shutdown abnormally

2015-12-23 Thread Ted Yu
Which hbase release are you using ? After a brief search, looks like Chinese char might be present in region name or config value. Can you double check ? On Wed, Dec 23, 2015 at 10:04 PM, yaoxiaohua wrote: > Hi, > > 172.19.206.142 ,this node is running datanode and > regionserv

regionserver shutdown abnormally

2015-12-23 Thread yaoxiaohua
Hi, the node 172.19.206.142 is running a datanode and a regionserver. The region server keeps shutting down from time to time. The following is part of the log; could you help me analyze why? 2015-12-05 10:20:41,570 WARN [PostOpenDeployTasks:b7b84410963cbc1484827ceca3439658] handler.OpenRe

Re: RegionServer Shutdown

2015-07-09 Thread divye sheth
Hi, No errors reported. I want to bring to your notice that this started after I replaced the hadoop 2.2.0 jars in the hbase lib with the hadoop 2.6.0 jars. Thanks! On Thu, Jul 9, 2015 at 1:35 PM, Dejan Menges wrote: > Hi, > > Can you add -x in your start-hbase.sh and try to run it then, mayb

Re: RegionServer Shutdown

2015-07-09 Thread Dejan Menges
Hi, Can you add -x in your start-hbase.sh and try to run it then, maybe it will tell you something more about some missing path/folder etc? On Thu, Jul 9, 2015 at 8:09 AM divye sheth wrote: > Hi Samir, > > While debugging I found the following. When I extract the hbase tar and run > it with ma

Re: RegionServer Shutdown

2015-07-08 Thread divye sheth
Hi Samir, While debugging I found the following. When I extract the hbase tar and run it with making only configuration changes the script works fine i.e. start-hbase.sh starts regionserver properly. Now since I am using hadoop-2.6.0, I replaced all the jars related to hadoop in the $HBASE_HOME/l

Re: RegionServer Shutdown

2015-07-07 Thread Samir Ahmic
OK, then it is something related to how classes are loaded when you start hbase with the start-hbase.sh script. The start-hbase.sh script uses ssh to start the hbase daemons if you are running hbase in distributed mode (this is also the case in pseudo-distributed mode), so I'm suggesting that you check your

Re: RegionServer Shutdown

2015-07-07 Thread divye sheth
Hi Samir, The output of the hadoop classpath command lists the directory $HADOOP_PREFIX/share/hadoop/common/lib/*, and the htrace-core-3.0.4.jar file resides inside this location. Could it be a version issue, since hbase comes with htrace-core-2.04.jar? And as I said, the regionserver starts fine if starte

Re: RegionServer Shutdown

2015-07-07 Thread Samir Ahmic
Hi, It looks like you are missing the htrace jar in your hadoop classpath. You can check it with: $ hadoop classpath | tr ":" "\n" | grep htrace If it is not in the classpath you will need to include it in the hadoop classpath. The HTrace jar is located in $HBASE_HOME/lib. Regards Samir On Tue, Jul 7, 2015 at 1:15
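
The check suggested in this message (split the classpath on ":" and look for "htrace") can be sketched in Python as follows. This is only an illustration: the function name and the sample classpath string are hypothetical, and real input would come from the output of `hadoop classpath`.

```python
def find_classpath_entries(classpath, needle):
    """Return the classpath entries whose text contains `needle`."""
    # Splitting on ":" mirrors `tr ":" "\n"`; the filter mirrors `grep`.
    return [entry for entry in classpath.split(":") if needle in entry]


# Hypothetical classpath string, used only to show the behaviour.
example = (
    "/opt/hadoop/etc/hadoop:"
    "/opt/hbase/lib/htrace-core-2.04.jar:"
    "/opt/hadoop/share/hadoop/common/lib/*"
)

print(find_classpath_entries(example, "htrace"))
```

As with the shell pipeline, a wildcard entry such as `.../common/lib/*` does not match even if an htrace jar sits inside that directory, so an empty result does not by itself prove the jar is missing.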

RegionServer Shutdown

2015-07-07 Thread divye sheth
Hi, I have installed Hbase-0.98 over Hadoop 2.6.0 in pseudo-distributed mode with zookeeper managed separately. Everything works fine and I am even able to access the hbase cluster without any issues when it is started using the hbase-daemon.sh script. The problem I am facing is that the regionserver immedia

Re: RegionServer shutdown with ScanWildcardColumnTracker exception

2013-04-17 Thread Amit Sela
No. It happened in our production environment after running counters increments every 5 minutes for a few weeks now. I could try to reproduce in test cluster environment but that would mean running for weeks as well... but I will keep digging and let you guys know if it happens again or / and I hav

Re: RegionServer shutdown with ScanWildcardColumnTracker exception

2013-04-17 Thread ramkrishna vasudevan
Are there any test cases that try to reproduce your issue? Regards Ram On Wed, Apr 17, 2013 at 9:47 PM, ramkrishna vasudevan < ramkrishna.s.vasude...@gmail.com> wrote: > There is a hint mechanism available when scanning happens. But I don't > think there should be much of a difference between a s

Re: RegionServer shutdown with ScanWildcardColumnTracker exception

2013-04-17 Thread ramkrishna vasudevan
There is a hint mechanism available when scanning happens. But I don't think there should be much of a difference between a scan that happens during a flush and a normal scan. Will look through the code and come back on this. Regards Ram On Wed, Apr 17, 2013 at 9:40 PM, Amit Sela wrote: > No, no e

Re: RegionServer shutdown with ScanWildcardColumnTracker exception

2013-04-17 Thread Amit Sela
No, no encoding. On Wed, Apr 17, 2013 at 6:56 PM, ramkrishna vasudevan < ramkrishna.s.vasude...@gmail.com> wrote: > @Lars > You have any suggestions on this? > > @Amit > You have any Encoder enabled like the Prefix Encoding stuff? > There was one optimization added recently but that is not in 0.

Re: RegionServer shutdown with ScanWildcardColumnTracker exception

2013-04-17 Thread ramkrishna vasudevan
@Lars You have any suggestions on this? @Amit You have any Encoder enabled like the Prefix Encoding stuff? There was one optimization added recently but that is not in 0.94.2 Regards Ram On Wed, Apr 17, 2013 at 5:17 PM, Amit Sela wrote: > I scanned over this counter with and without column sp

Re: RegionServer shutdown with ScanWildcardColumnTracker exception

2013-04-17 Thread Amit Sela
I scanned over this counter with and without column specification and all looks OK now. I have no CPs in this table. Is there some kind of hint mechanism in HBase's internal scan? It's weird that ScanWildcardColumnTracker.checkColumn says that the column is smaller than the previous column: *impr

Re: RegionServer shutdown with ScanWildcardColumnTracker exception

2013-04-17 Thread ramkrishna vasudevan
Hi Amit, Checking the code, this is possible when the qualifiers are not sorted. Do you have any CPs in your path that try to play with the KVs? Seems to be a very weird thing. Can you try doing a scan on the KV just before this happens? That will tell you the existing KVs that are present. Ev
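
The condition described here, a column qualifier that sorts before the one preceding it, can be illustrated with a small Python sketch. This is not HBase's implementation; the function name and the sample qualifiers are hypothetical, and the only assumption is that qualifiers are compared as raw bytes in lexicographic order.

```python
def first_unsorted_index(qualifiers):
    """Return the index of the first qualifier that is strictly smaller
    than its predecessor, or None if the sequence is in order."""
    for i in range(1, len(qualifiers)):
        # Python compares bytes objects lexicographically, byte by byte.
        if qualifiers[i] < qualifiers[i - 1]:
            return i
    return None


# Hypothetical qualifiers, loosely following the naming pattern in the thread.
in_order = [b"impressions_0_15", b"impressions_1_15", b"impressions_ALL_15"]
out_of_order = [b"impressions_1_15", b"impressions_0_15"]

print(first_unsorted_index(in_order))
print(first_unsorted_index(out_of_order))
```

Running a check of this kind over the qualifiers returned by a scan of the affected row is one way to see whether the stored KVs are out of order.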

Re: RegionServer shutdown with ScanWildcardColumnTracker exception

2013-04-17 Thread Amit Sela
The cluster runs Hadoop 1.0.4 and HBase 0.94.2. I have three families in this table: weekly, daily, hourly. Each family has the following qualifiers: Weekly - impressions_{countrycode}_{week#} - country code is 0, 1 or ALL (aggregation of both 0 and 1) Daily and hourly are the same but with MMd

Re: RegionServer shutdown with ScanWildcardColumnTracker exception

2013-04-17 Thread ramkrishna vasudevan
Seems interesting. Can you tell us what are the families and the qualifiers available in your schema. Any other interesting logs that you can see before this? BTW the version of HBase is also needed? If we can track it out we can then file a JIRA if it is a bug. Regards RAm On Wed, Apr 17,

RegionServer shutdown with ScanWildcardColumnTracker exception

2013-04-17 Thread Amit Sela
Hi all, I had a regionserver crash during a counter increment. Looking at the regionserver log I saw: org.apache.hadoop.hbase.DroppedSnapshotException: region: TABLE_NAME, ROW_KEY...at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1472) at org.apache.hadoop

Re: xceiver count, regionserver shutdown

2012-02-07 Thread Jean-Daniel Cryans
Hurray! Thanks for following up, I really appreciate it. The upload speed should be better too. I should take some time to write this all down in a more readable format for the reference guide on the website. J-D On Tue, Feb 7, 2012 at 5:13 AM, Bryan Keller wrote: > Just to follow up, this did indee

Re: xceiver count, regionserver shutdown

2012-02-07 Thread Bryan Keller
Just to follow up, this did indeed fix the problem I was having with hitting the xceiver limit. Thanks a bunch for the help, I have a much better understanding of how heap, memstore size, and number of regions all play a role in performance and resource usage. On Feb 6, 2012, at 5:03 PM, Jean-D

Re: xceiver count, regionserver shutdown

2012-02-06 Thread Jean-Daniel Cryans
On Mon, Feb 6, 2012 at 4:47 PM, Bryan Keller wrote: > I increased the max region file size to 4gb so I should have fewer than 200 > regions per node now, more like 25. With 2 column families that will be 50 > memstores per node. 5.6gb would then flush files of 112mb. Still not close to > the me

Re: xceiver count, regionserver shutdown

2012-02-06 Thread Bryan Keller
I increased the max region file size to 4gb so I should have fewer than 200 regions per node now, more like 25. With 2 column families that will be 50 memstores per node. 5.6gb would then flush files of 112mb. Still not close to the memstore limit but shouldn't I be much better off than before?
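
The arithmetic in this message can be checked with a short Python sketch. It assumes, purely for illustration, that every memstore fills evenly and shares one memstore budget, and that 5.6gb is counted as 5600 MB (which is what gives the quoted 112mb). The function names are hypothetical.

```python
def memstores_per_node(regions, families):
    """Each region keeps one memstore per column family."""
    return regions * families


def average_flush_size_mb(memstore_budget_mb, memstores):
    """Average size each memstore reaches if the budget is shared evenly."""
    return memstore_budget_mb / memstores


budget_mb = 5600  # 5.6gb from the message, taken as 5600 MB (assumption)

after = memstores_per_node(25, 2)    # region count after raising max file size
before = memstores_per_node(200, 2)  # the 200-region case from the thread

print(after, average_flush_size_mb(budget_mb, after))    # 50 112.0
print(before, average_flush_size_mb(budget_mb, before))  # 400 14.0
```

Under the same assumptions, the 200-region layout discussed elsewhere in the thread would yield flushes of only about 14 MB, which shows why reducing the region count matters more than raising the flush size alone.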

Re: xceiver count, regionserver shutdown

2012-02-06 Thread Jean-Daniel Cryans
Good but... Keep in mind that if you just increase max filesize and memstore size without changing anything else then you'll be in the same situation except with 16GB it'll take just a bit more time to get there. Here's the math: 200 regions of 2 families means 400 memstores to fill. Assuming a

Re: xceiver count, regionserver shutdown

2012-02-06 Thread Bryan Keller
Yes, insert pattern is random, and yes, the compactions are going through the roof. Thanks for pointing me in that direction. I am going to try increasing the region max filesize to 4gb (it was set to 512mb) and the memstore flush size to 512mb (it was 128mb). I'm also going to increase the hea

Re: xceiver count, regionserver shutdown

2012-02-06 Thread Jean-Daniel Cryans
Ok this helps, we're still missing your insert pattern regarding but I bet it's pretty random considering what's happening to your cluster. I'm guessing you didn't set up metrics else you would have told us that the compaction queues are through the roof during the import, but at this point I'm pr

Re: xceiver count, regionserver shutdown

2012-02-06 Thread Bryan Keller
This is happening during heavy update. I have a "wide" table with around 4 million rows that have already been inserted. I am adding billions of columns to the rows. Each row can have 20+k columns. I perform the updates in batch, i.e. I am using the HTable.put(List) API. The batch size is 1000

Re: xceiver count, regionserver shutdown

2012-02-06 Thread Jean-Daniel Cryans
The number of regions is the first thing to check, then it's about the actual number of blocks opened. Is the issue happening during a heavy insert? In this case I guess you could end up with hundreds of opened files if the compactions are piling up. Setting a bigger memstore flush size would defin

xceiver count, regionserver shutdown

2012-02-06 Thread Bryan Keller
I am trying to resolve an issue with my cluster when I am loading a bunch of data into HBase. I am reaching the "xciever" limit on the data nodes. Currently I have this set to 4096. The data node is logging "xceiverCount 4097 exceeds the limit of concurrent xcievers 4096". The regionservers even
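
To spot this condition in data node logs, the quoted message can be matched with a regular expression. The sketch below is a minimal illustration in Python: the pattern is built only from the message text quoted above, and the function name is hypothetical.

```python
import re

# Pattern taken from the log message quoted in the post; note the two
# spellings ("xceiverCount" and "xcievers") that appear in that message.
XCEIVER_RE = re.compile(
    r"xceiverCount (\d+) exceeds the limit of concurrent xcievers (\d+)"
)


def parse_xceiver_overflow(line):
    """Return (count, limit) if the line reports an xceiver overflow, else None."""
    match = XCEIVER_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = "xceiverCount 4097 exceeds the limit of concurrent xcievers 4096"
print(parse_xceiver_overflow(sample))  # (4097, 4096)
```

Counting how often such lines appear during a load gives a rough picture of how long the data node stays at its limit.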

Re: all regionserver shutdown after close hdfs datanode

2010-12-22 Thread Stack
06 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer > Exception: java.io.IOException: Connection reset by peer > > The hbase version I use is 0.20.6, not 0.89. > > Zhou > > -Original Message- > From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of Stack > Sent: 2010-1

Re: all regionserver shutdown after close hdfs datanode

2010-12-22 Thread Zhou Shuaifeng
Sent: 2010-12-22 3:12 To: user@hbase.apache.org Subject: Re: all regionserver shutdown after close hdfs datanode 2010/12/20 Zhou Shuaifeng : > Hi, > I checked the log, It's not the master caused the regionserver shutdown, but > the regionserver log rolling failed caused regionserver sh

Re: all regionserver shutdown after close hdfs datanode

2010-12-21 Thread Stack
2010/12/20 Zhou Shuaifeng : > Hi, > I checked the log, It's not the master caused the regionserver shutdown, but > the regionserver log rolling failed caused regionserver shutdown. > The problem block only had one replica? If you look in the hdfs emissions, it'll usuall

Re: all regionserver shutdown after close hdfs datanode

2010-12-20 Thread Zhou Shuaifeng
Hi, I checked the log. It was not the master that caused the regionserver shutdown; a failed regionserver log roll caused it. According to the log, an error occurred in the pipeline, but why is hdfs not able to select another good datanode when one datanode in the pipeline i

Re: all regionserver shutdown after close hdfs datanode

2010-12-20 Thread Daniel Iancu
Hi Zhou You should check if the HMaster is still up. If not, check its logs, if for some reason HMaster thinks HDFS is not available it will shutdown the HBase cluster. Regards Daniel On 12/20/2010 06:15 AM, Zhou Shuaifeng wrote: Hi, I have a cluster of 8 hdfs datanodes and 8 hbase regions

Re: all regionserver shutdown after close hdfs datanode

2010-12-20 Thread Zhou Shuaifeng
Sent: 2010-12-20 12:35 To: user@hbase.apache.org Subject: Re: all regionserver shutdown after close hdfs datanode Hi Shuaifeng What is your hdfs version? Maybe this can solve this problem: This is the current list of patches we recommend you apply to your running Hadoop cluster: - HDF

Re: all regionserver shutdown after close hdfs datanode

2010-12-19 Thread 陈加俊
Hi Shuaifeng What is your hdfs version? Maybe this can solve this problem: This is the current list of patches we recommend you apply to your running Hadoop cluster: - HDFS-630: *"In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the n

all regionserver shutdown after close hdfs datanode

2010-12-19 Thread Zhou Shuaifeng
Hi, I have a cluster of 8 hdfs datanodes and 8 hbase regionservers. When I shut down one node (a pc with one datanode and one regionserver running), all hbase regionservers shut down after a while. The other 7 hdfs datanodes are OK. I think this is not reasonable. Hbase is a distributed system that

Re: RegionServer shutdown - how to recover?

2010-09-27 Thread Jean-Daniel Cryans
If an old recovered.edits was already there from a previous failure from before your upgrade on Sept. 5th then it's sufficient to explain the problem you encountered. J-D On Sun, Sep 26, 2010 at 8:09 AM, Matthew LeMieux wrote: > I took a look at the bug (https://issues.apache.org/jira/browse/HBA

Re: RegionServer shutdown - how to recover?

2010-09-26 Thread Matthew LeMieux
I took a look at the bug (https://issues.apache.org/jira/browse/HBASE-3027). The issue I'm experiencing does not appear to be a result of deploying a new version. I've been on the same version since Sept 5th. -Matthew On Sep 22, 2010, at 9:01 PM, Stack wrote: > Could it be HBASE-3027? (Did

Re: RegionServer shutdown - how to recover?

2010-09-22 Thread Stack
Could it be HBASE-3027? (Did you upgrade recently?) St.Ack On Wed, Sep 22, 2010 at 8:21 PM, Matthew LeMieux wrote: > That would be an awfully long GC pause.  GC logging was not enabled on that > particular machine (it is now). > > The only exception in that part of the log file, > "java.io.File

Re: RegionServer shutdown - how to recover?

2010-09-22 Thread Matthew LeMieux
That would be an awfully long GC pause. GC logging was not enabled on that particular machine (it is now). The only exception in that part of the log file, "java.io.FileNotFoundException: Parent path is not a directory" occurred quite a few times, although HDFS seems to think of itself as be

Re: RegionServer shutdown - how to recover?

2010-09-22 Thread Jean-Daniel Cryans
It seems that the log splitting didn't succeed (can't tell why, the log snippet is too small). Did it get stuck on EOF or something like that? In any case, it looks like a bug. >   * how do I avoid disruption when this sort of thing happens Your region server was unavailable for 3 minutes, proba

RegionServer shutdown - how to recover?

2010-09-22 Thread Matthew LeMieux
I've just had another of those region server shutdowns (RegionServerSuicide). * how do I avoid disruption when this sort of thing happens * what is the best procedure for recovering from such a thing (i.e., is there something to be done other than simply restarting the region server?) I'