Hello all,
In an AWS outage we lost about a fifth of our regionservers, and about an
eighth of our total datanodes. Despite a replication factor of 3, it appears
we may have lost some data from corrupt HLogs. Looking at my HMaster I see
messages like this:
12/06/30 00:00:48 INFO wal.HLogSplitter:
I somewhat have HBase up and running in a distributed mode. It starts
fine, I can use hbase shell to create, disable, and drop tables;
however, after a short period of time HMaster and the HRegionServers
terminate. Decoding the error messages is a bit bewildering and the
O'Reilly HBase book
Try cleaning up your ZooKeeper data. I have had similar issues before due to
corrupt ZooKeeper data or bad ZooKeeper state.
--
On Sat 30 Jun, 2012 4:12 AM IST Jay Wilson wrote:
I somewhat have HBase up and running in a distributed mode. It starts
fine, I can use
I'm new to HBase myself, and when first trying to learn its installation
path, I couldn't find a decent end-to-end installation guide (HDFS, HBase,
Linux-specific stuff, etc).
I wrote some installation notes, which I'll be happy to expand into a
full-fledged guide, if it can be added to the
On Sat, Jun 30, 2012 at 3:50 PM, Asaf Mesika asaf.mes...@gmail.com wrote:
I'm new to HBase myself, and when first trying to learn its installation
path, I couldn't find a decent end-to-end installation guide (HDFS, HBase,
Linux-specific stuff, etc).
So, you complain about the hbase doc
On Sat, Jun 30, 2012 at 12:42 AM, Jay Wilson j...@circle-cross-jn.com wrote:
java.net.NoRouteToHostException: No route to host
I do not see how hbase config. could provoke the above. There is
something up w/ your base network setup.
St.Ack
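To separate a routing problem from an HBase problem, a single TCP connection attempt from the failing machine can be classified by its OS-level error. The following is a minimal sketch, not an HBase tool; the function names are made up for illustration, and the host name in the usage note is a placeholder.

```python
import errno
import socket


def classify_connect_error(exc):
    """Map a socket-level OSError to a short, human-readable cause."""
    if isinstance(exc, socket.timeout):
        return "timeout"
    if exc.errno == errno.EHOSTUNREACH:
        # The OS-level condition that Java reports as NoRouteToHostException.
        return "no route to host"
    if exc.errno == errno.ENETUNREACH:
        return "network unreachable"
    if exc.errno == errno.ECONNREFUSED:
        return "connection refused"
    return "other error"


def probe(host, port, timeout=3.0):
    """Try one TCP connection and return 'ok' or the classified failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except OSError as exc:
        return classify_connect_error(exc)
```

For example, `probe("regionserver-1.example", 60020)` run from the master host (placeholder host name; substitute your own host and port). A "no route to host" result points at routing or firewall rules rather than at the HBase configuration, while "connection refused" means the host was reached but nothing is listening on that port.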
On Sat, Jun 30, 2012 at 8:38 AM, Bryan Beaudreault
bbeaudrea...@hubspot.com wrote:
12/06/30 00:00:48 INFO wal.HLogSplitter: Got while parsing hlog
hdfs://my-namenode-ip-addr:8020/hbase/.logs/my-rs-ip-addr,60020,1338667719591/my-rs-ip-addr%3A60020.1340935453874.
Marking as corrupted
What size
On Sat, Jun 30, 2012 at 7:04 AM, Bryan Beaudreault
bbeaudrea...@hubspot.com wrote:
12/06/30 00:07:22 INFO ipc.Client: Retrying connect to server: /10.125.18.129:50020. Already tried 14 time(s).
This was one of the servers that went down?
It was not following through the splitting of HLog
Bryan,
The master could not detect whether the region server was dead.
How do you set the zookeeper session timeout?
Thanks,
Jimmy
On Sat, Jun 30, 2012 at 8:09 AM, Stack st...@duboce.net wrote:
On Sat, Jun 30, 2012 at 7:04 AM, Bryan Beaudreault
bbeaudrea...@hubspot.com wrote:
12/06/30 00:07:22
I've tried editing but I don't have permissions. What should be done to
obtain them?
Yeah, I am. Look, running HBase over Linux and HDFS seems like the basic
installation for an HBase newbie from my perspective. A quick-start guide
for this setup could therefore save time for many people.
Sent from my iPhone
They are all pretty large, around 40+ MB. Will WALPlayer be smart
enough to only write edits that still look relevant (i.e. based on
timestamps of the edits vs timestamps of the versions in HBase)? Writes
have been coming in since we recovered.
On Sat, Jun 30, 2012 at 11:05 AM, Stack
WALPlayer will look at the timestamp. Replaying an older edit that has
since been overwritten shouldn't change anything.
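A toy model may help show why replaying an older edit is harmless when every edit keeps its original timestamp. This is only a sketch of timestamp-versioned cells; `ToyCellStore` is a made-up class, not the HBase or WALPlayer API, and it ignores deletes and version limits.

```python
class ToyCellStore:
    """Toy model of timestamp-versioned cells; NOT the HBase API."""

    def __init__(self):
        # (row, column) -> {timestamp: value}
        self._cells = {}

    def put(self, row, column, timestamp, value):
        # Writing the same timestamp twice simply overwrites that one version.
        self._cells.setdefault((row, column), {})[timestamp] = value

    def get_latest(self, row, column):
        versions = self._cells.get((row, column))
        if not versions:
            return None
        # A read returns the version with the highest timestamp.
        return versions[max(versions)]


store = ToyCellStore()
store.put("row1", "cf:a", 200, "new")    # write accepted after recovery
before = store.get_latest("row1", "cf:a")
store.put("row1", "cf:a", 100, "old")    # replayed edit keeps its older timestamp
after = store.get_latest("row1", "cf:a")
print(before, after)                     # prints: new new
```

The replayed edit lands as an older version underneath the newer write, so the latest read is unchanged.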
On Sat, Jun 30, 2012 at 9:49 AM, Bryan Beaudreault bbeaudrea...@hubspot.com
wrote:
They are all pretty large, around 40+ MB. Will WALPlayer be smart
enough to only
I should have mentioned in my initial email that I am operating on HBase
0.90.4. Is WALPlayer available in this version? I am having trouble
finding it or anything similar.
On Sat, Jun 30, 2012 at 1:14 PM, Li Pi l...@idle.li wrote:
WALPlayer will look at the timestamp. Replaying an older edit
Nope. It came out in 0.94 otoh.
On Sat, Jun 30, 2012 at 12:29 PM, Bryan Beaudreault
bbeaudrea...@hubspot.com wrote:
I should have mentioned in my initial email that I am operating on HBase
0.90.4. Is WALPlayer available in this version? I am having trouble
finding it or anything similar.
I had exactly the same behaviour some months ago.
Check the heap sizes of all Hadoop services against the available RAM on
every machine.
(machine memory should be higher than the sum of the services' heap sizes)
In my case that solved the problem.
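The check above is simple arithmetic and can be sketched as follows. The service names and figures are hypothetical placeholders for one machine; substitute the maximum heap each daemon is actually configured with.

```python
def heap_headroom_mb(total_ram_mb, heap_sizes_mb):
    """Return the RAM left after summing the configured max heap of every service."""
    return total_ram_mb - sum(heap_sizes_mb.values())


# Hypothetical figures for one machine, in MB; substitute your own settings.
heaps = {"datanode": 1024, "tasktracker": 1024, "regionserver": 4096}
headroom = heap_headroom_mb(7168, heaps)
print(headroom)  # prints: 1024
if headroom <= 0:
    print("over-committed: heaps add up to at least all of physical RAM")
```

A positive result is necessary but not sufficient: each JVM also uses memory outside its heap, and the operating system needs some for itself, so it is worth leaving a comfortable margin.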
From: Dhaval
To run HBase (or for that matter any distributed system) you need your
networking setup to function properly. "No route to host" is caused by issues
with the underlying network. I have seen ToR switches losing packets, causing these
exceptions. There could be several other issues that could cause
This is interesting because I have seen this happen in the past. Can WALPlayer be
backported to 0.90.x?
Best Regards,
Jerry
Sent from my iPad
On 2012-06-30, at 16:34, Li Pi l...@idle.li wrote:
Nope. It came out in 0.94 otoh.
On Sat, Jun 30, 2012 at 12:29 PM, Bryan Beaudreault
Hi all,
If you are still debugging high CPU usage on your java processes, read
this:
http://serverfault.com/questions/403732/anyone-else-experiencing-high-rates-of-linux-server-crashes-today
Hope this helps,
J-D
Thanks all for the additional input. I no longer think the HLogs are
corrupted; at least, I think they only appeared corrupt because we had also
lost a good portion of the datanodes. We have since recovered all the
datanodes, so they are good.
We will look into creating an executable jar out of your WALPlayer