This is basically a read bug/performance problem. The execution path
followed when the caching is used up is not consistent with the initial
execution path/performance. Can anyone help shed light on this? Were there
any changes in 0.94 that introduced this (we have not tested on other
versions)? Any
21, 2012, at 5:34 AM, Wayne wav...@gmail.com wrote:
Sorry but it would be too hard for us to be able to provide enough info
in
a Jira to accurately reproduce. Our read problem is through thrift and
has
everything to do with the row just being too big to bring back in its
entirety (13
wrote:
On Fri, Jan 20, 2012 at 11:43 AM, Wayne wav...@gmail.com wrote:
Does 0.92 support a significant increase in row size over 0.90.x? With
0.90.4 we have seen writes start choking at 30 million cols/row and reads
start choking at 10 million cols/row. Can we assume these numbers will go
It was the kernel...thanks for the help. We had considered but could not
really believe a slight kernel version difference would cause this.
Thanks
On Wed, Jan 18, 2012 at 11:47 PM, Stack st...@duboce.net wrote:
On Wed, Jan 18, 2012 at 8:40 AM, Wayne wav...@gmail.com wrote:
We have set up
As we load more and more data into HBase we are seeing the millions of
columns to be a challenge for us. We have some very wide rows and we are
taking 12-15 seconds to read those rows. Since HBase does not sort columns
and thereby can not support a scan of columns we are really seeing some
serious
I think you are right in that Thrift is all we see and it is very limited.
Comments in-line.
On Wed, Aug 10, 2011 at 8:33 PM, Stack st...@duboce.net wrote:
On Wed, Aug 10, 2011 at 2:39 AM, Wayne wav...@gmail.com wrote:
As we load more and more data into HBase we are seeing the millions
in this case.
Thanks!!
On Fri, Jul 8, 2011 at 6:12 PM, Stack st...@duboce.net wrote:
Going by the below where hdfs reports 173 lost blocks, I think the
only recourse is as you suggest in the below Wayne, a recovery mode
that goes through and sees what is out there and rebuilds the meta
based
To: user@hbase.apache.org
Sent: Sunday, July 3, 2011 12:39 AM
Subject: Re: hbck -fix
Wayne,
Did you by chance have your NameNode configured to write the edit log to
only
one disk, and in this case only the root volume of the NameNode host? As
I'm
sure you are now aware, the NameNode's
direct to fix what we need.
Thanks.
On Fri, Jul 1, 2011 at 5:03 PM, Stack st...@duboce.net wrote:
On Fri, Jul 1, 2011 at 8:32 AM, Wayne wav...@gmail.com wrote:
We are not in production so we have the luxury to start again, but the
damage to our confidence is severe. Is there work going
tried running check_meta.rb with --fix ?
On Sat, Jul 2, 2011 at 9:19 AM, Wayne wav...@gmail.com wrote:
We are running 0.90.3. We were testing the table export not realizing the
data goes to the root drive and not HDFS. The export filled the master's
root partition. The logger had issues
but some of the
.META. table entries were still there. Finally this afternoon we reformatted
the entire cluster.
Thanks.
On Sat, Jul 2, 2011 at 5:25 PM, Stack st...@duboce.net wrote:
On Sat, Jul 2, 2011 at 9:55 AM, Wayne wav...@gmail.com wrote:
It just returns a ton of errors (import: command
We had some serious issues from the hmaster running out of space on the root
partition. We were getting region server not found errors on the client
which then turned to client errors servers have issues etc.
I ran the hbck command and found 14 inconsistencies. There were files in
hdfs not used
On Fri, Jul 1, 2011 at 11:32 AM, Wayne wav...@gmail.com wrote:
We had some serious issues from the hmaster running out of space on the
root partition. We were getting region server not found errors on the
client which then turned to client errors servers have issues etc.
I ran the hbck
We are seeing responseTooLarge for: next... errors in our region server
logs. If I understand correctly is this caused by opening a scanner and the
rows are too big to be returned? If the scan batch size is set to 1 this
tells me these are rows too big to actually read. Is this correct? Why would
We also see a lot of responseTooLarge for: multi. Not sure what this is...
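One common way to keep individual RPC responses small is to stream the scan in bounded batches instead of pulling a whole wide row back in one call. A rough sketch; the method names are modeled on the HBase Thrift interface (scannerOpenWithStop / scannerGetList / scannerClose), but the client object itself is hypothetical:

```python
def scan_in_batches(client, table, start_row, stop_row, batch_size=1000):
    """Stream a scan in bounded batches so no single RPC response has
    to carry an entire multi-million-column row. The client is assumed
    to expose the scannerOpenWithStop / scannerGetList / scannerClose
    calls of the HBase Thrift interface."""
    scanner_id = client.scannerOpenWithStop(table, start_row, stop_row, [])
    try:
        while True:
            rows = client.scannerGetList(scanner_id, batch_size)
            if not rows:
                break
            for row in rows:
                yield row
    finally:
        client.scannerClose(scanner_id)
```

Lowering batch_size trades more round trips for a bounded response size, which is exactly the knob the responseTooLarge warning is complaining about.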
On Tue, Jun 7, 2011 at 9:50 AM, Wayne wav...@gmail.com wrote:
We are seeing responseTooLarge for: next... errors in our region server
logs. If I understand correctly is this caused by opening a scanner and the
rows
I had 25 sec CMF failure this morning...looks like bulk inserts are required
along with possibly weekly/daily scheduled rolling restarts. Do most
production clusters run rolling restarts on a regular basis to give the JVM
a fresh start?
Thanks.
On Thu, Jun 2, 2011 at 1:56 PM, Wayne wav
Are there any recommended methods/scripts to monitor nodes via nagios? It
would be best to have a simple nagios call to check hadoop, hbase,
thrift separately and alarm if one of them is awol (and not have the script
cause damage like I have read with thrift). For example our friendly CMF
issues
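For the Nagios side, a probe that only opens and immediately closes a TCP connection cannot disturb the server the way ad-hoc Thrift calls reportedly can. A minimal sketch (the ports to probe are deployment-specific and not taken from this thread):

```python
import socket

def service_up(host, port, timeout=2.0):
    """Return True if a TCP connect to host:port succeeds.
    Read-only probe: it opens and immediately closes the socket."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A Nagios wrapper would call this once per daemon (HDFS, HBase master, region server, Thrift) and map False to a CRITICAL exit code.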
I have finally been able to spend enough time to digest/test
all recommendations and get this under control. I wanted to thank Stack,
Jack Levin, and Ted Dunning for their input.
Basically our memory was being pushed to the limit and the JVM does not
like/can not handle this. We are successfully
Our storefileindex was pushing 3g. We used the hfile tool to see that we had
very large keys (50-70 bytes) and small values (5-7 bytes). Jack pointed me
to a great Jira about this: https://issues.apache.org/jira/browse/HBASE-3551.
We HAD to increase from the default and we picked 256k to reduce
as regions are compacted.
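For reference, the block-size increase described above is a per-column-family attribute set from the HBase shell; a hedged example with placeholder table/family names ('xxx'/'cf'), using the 0.90-era alter syntax in which the table must be disabled first:

```ruby
disable 'xxx'
alter 'xxx', {NAME => 'cf', BLOCKSIZE => '262144'}   # 256k
enable 'xxx'
```

The new block size only takes effect as store files are rewritten, which matches the "as regions are compacted" observation above.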
We have enabled the memstore MSLAB option. Not sure it is relevant but we
have a 5G region size and 256m memstore flush size.
Thanks.
On Thu, Jun 2, 2011 at 11:48 AM, Stack st...@duboce.net wrote:
Thanks for writing back to the list Wayne. Hopefully this message
hits
load for long
enough to gain confidence.
Thanks.
On Thu, May 26, 2011 at 3:43 AM, Jack Levin magn...@gmail.com wrote:
Wayne, I think you are hitting fragmentation, how often do you flush?
Can you share memstore flush graphs?
Here is ours:
http://img851.yfrog.com/img851/9814
logs.
http://pastebin.com/kJyJHgQc
Thanks.
On Thu, May 26, 2011 at 9:08 AM, Wayne wav...@gmail.com wrote:
Attached is our memstore size graph...not sure it will make it to the post.
Ours is definitely not as graceful as yours. You can see where we restarted
last 16 hours ago. We have not had
On Thu, May 26, 2011 at 9:00 AM, Wayne wav...@gmail.com wrote:
Looking more closely I can see that we are still
getting Concurrent Mode Failures on some of the nodes but they are only
lasting for 10s so the nodes don't go away. Is this considered normal?
With CMSInitiatingOccupancyFraction=65
On Thu, May 26, 2011 at 1:41 PM, Stack st...@duboce.net wrote:
On Thu, May 26, 2011 at 6:08 AM, Wayne wav...@gmail.com wrote:
I think our problem is the load pattern. Since we use a very controlled q
based method to do work our Python code is relentless in terms of keeping
the pressure up
: 7621159K->2503625K(8111744K), 63.3195660 secs] 7798327K->2503625K(8360960K), [CMS Perm : 20128K->20106K(33580K)] icms_dc=100 , 63.8965450 secs] [Times: user=69.50 sys=0.01, real=63.89 secs]
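The multi-second "real=" figure is the number to watch in GC log lines like the one above. A small parser sketch for flagging long pauses (the 10-second threshold is arbitrary, not a recommendation):

```python
import re

# "real=" is the wall-clock time of the GC event in HotSpot logs
PAUSE_RE = re.compile(r"real=([\d.]+) secs")

def long_pauses(lines, threshold=10.0):
    """Yield (line_number, seconds) for GC log lines whose real time
    exceeds the threshold -- handy for spotting the stop-the-world
    pauses that make a region server go AWOL."""
    for i, line in enumerate(lines, 1):
        m = PAUSE_RE.search(line)
        if m and float(m.group(1)) > threshold:
            yield i, float(m.group(1))
```

Running this over a rolled set of GC logs gives a quick timeline of concurrent-mode failures without reading every line.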
On Mon, May 23, 2011 at 12:04 PM, Stack st...@duboce.net wrote:
On Mon, May 23, 2011 at 8:42 AM, Wayne wav
://pastebin.com/ca13aMRu
http://pastebin.com/9KfRZFBW
On Wed, May 25, 2011 at 1:42 PM, Todd Lipcon t...@cloudera.com wrote:
Hi Wayne,
Looks like your RAM might be oversubscribed. Could you paste your
hbase-site.xml and hbase-env.sh files? Also looks like you have some
strange GC settings
, 2011 at 11:08 AM, Wayne wav...@gmail.com wrote:
I tried to turn off all special JVM settings we have tried in the past.
Below are link to the requested configs. I will try to find more logs for
the full GC. We just made the switch and on this node it has
only occurred once in the scope
as normal. Cassandra had the exact same problems for us (plus a lot of other
issues), and we all know what is common between the two.
On Wed, May 25, 2011 at 2:39 PM, Ted Dunning tdunn...@maprtech.com wrote:
Wayne,
It should be recognized that your experiences are a bit out of the norm
here
with writes nothing we do seems to stop it with
getting paused very very frequently.
I will look into the zookeeper log location, never looked at those...
Thanks for the help.
On Wed, May 25, 2011 at 3:23 PM, Stack st...@duboce.net wrote:
On Wed, May 25, 2011 at 11:08 AM, Wayne wav...@gmail.com
updates?
On Wed, May 25, 2011 at 2:44 PM, Wayne wav...@gmail.com wrote:
What are your write levels? We are pushing 30-40k writes/sec/node on 10
nodes for 24-36-48-72 hours straight. We have only 4 writers per node so
we
are hardly overwhelming the nodes. Disk utilization runs at 10-20%, load
square shop and you would be a square one in my round one.
On Wed, May 25, 2011 at 5:55 PM, Wayne wav...@gmail.com wrote:
We are not a Java shop, and do not want to become one. I think to push
the
limits and do well with hadoop+hdfs you have to buy into Java and have
deep
skills there. We
I have switched to using the mslab enabled java setting to try to avoid GC
causing nodes to go awol but it almost appears to be worse. Below is the
latest problem with the JVM apparently actually crashing. I am using 0.90.1
with an 8GB heap. Is there a recommended JVM and recommended settings to
In order to reduce the total number of regions we have upped the max region
size to 5g. This has kept us below 100 regions per node but the side effect
is pauses occurring every 1-2 min under heavy writes to a single region. We
see the too many store files delaying flush up to 90sec warning every
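The 90-second figure in that warning matches the default of hbase.hstore.blockingWaitTime; the related knobs live in hbase-site.xml. The values below are illustrative, not recommendations:

```xml
<!-- hbase-site.xml -->
<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>15</value> <!-- default 7: store-file count at which writes block -->
</property>
<property>
  <name>hbase.hstore.blockingWaitTime</name>
  <value>90000</value> <!-- default 90s: the figure quoted in the warning -->
</property>
```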
...@maprtech.com wrote:
Do you have the same problem with a more recent JVM?
On Mon, May 23, 2011 at 4:52 AM, Wayne wav...@gmail.com wrote:
I have switched to using the mslab enabled java setting to try to avoid
GC
causing nodes to go awol but it almost appears to be worse. Below is the
latest problem
data nodes, but you know what they say about assumptions...
From: tdunn...@maprtech.com
Date: Mon, 23 May 2011 07:33:05 -0700
Subject: Re: mslab enabled jvm crash
To: user@hbase.apache.org
Do you have the same problem with a more recent JVM?
On Mon, May 23, 2011 at 4:52 AM, Wayne
u17 was released a year and a half ago. Latest is u25 (we run u24).
What kind of 'crash' are you seeing? What is your OS?
St.Ack
On Mon, May 23, 2011 at 8:19 AM, Wayne wav...@gmail.com wrote:
Zookeeper is not on the same nodes...and yes we could up to 120 seconds
but
then we are back
is a region server log snippet for this occurring 2x in less than a 2
minute period.
http://pastebin.com/CxAQSXTt
On Mon, May 23, 2011 at 11:33 AM, Stack st...@duboce.net wrote:
On Mon, May 23, 2011 at 6:40 AM, Wayne wav...@gmail.com wrote:
In order to reduce the total number of regions we
tables as you like. I
do not believe there is a cost to having more tables.
St.Ack
On Wed, May 18, 2011 at 5:54 AM, Wayne wav...@gmail.com wrote:
How many tables can a cluster realistically handle or how many
tables/node
can be supported? I am looking for a realistic idea of whether a 10 node
How many tables can a cluster realistically handle or how many tables/node
can be supported? I am looking for a realistic idea of whether a 10 node
cluster can support 100 or even 500 tables. I realize it is recommended to
have a few tables at most (and to use the row key to add everything to one
What JVM is recommended for the new memstore allocator? We switched from u23
back to u17 which helped a lot. Is this optimized for a specific JVM or does
it not matter?
On Fri, Feb 18, 2011 at 5:46 PM, Todd Lipcon t...@cloudera.com wrote:
On Fri, Feb 18, 2011 at 12:10 PM, Jean-Daniel Cryans
, 2011 at 2:15 AM, M. C. Srivas mcsri...@gmail.com wrote:
I was reading this thread with interest. Here's my $.02
On Fri, Dec 17, 2010 at 12:29 PM, Wayne wav...@gmail.com wrote:
Sorry, I am sure my questions were far too broad to answer.
Let me *try* to ask more specific questions. Assuming
Compaction Queue size usually explains a lot. That along with load and disk
utilization are what I use the most. I am definitely interested in what
others use, especially metrics that give early warning of problems.
Thanks.
On Wed, Feb 9, 2011 at 1:42 PM, Tim Sell trs...@gmail.com wrote:
What do
it be that your region servers
are creating them faster than that?
In any case, it's safe to delete them but not the folder itself. Also
please
open a jira and assign it to me.
J-D
On Jan 29, 2011 5:22 PM, Wayne wav...@gmail.com wrote:
The current log folder in hdfs (.logs) seems to always
I know there were some changes in .90 in terms of how region balancing
occurs. Is there a resource somewhere that describes the options for the
configuration? Per Jonathan Gray's recommendation we are trying to keep our
region count down to 100 per region server (we are up to 5gb region size).
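The ~100-regions-per-server target falls out of simple arithmetic once the max region size is fixed; a sketch of the estimate (numbers illustrative, and it ignores compression and uneven splits):

```python
def regions_per_server(total_data_gb, region_size_gb, servers):
    """Back-of-the-envelope estimate: raw data divided by max region
    size, spread evenly across region servers. Ignores compression,
    store-file overhead and uneven region splits."""
    return (total_data_gb / region_size_gb) / servers

# e.g. ~5 TB of data in 5 GB regions on a 10-node cluster
# works out to roughly 100 regions per region server
```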
2, 2011 at 7:51 PM, Wayne wav...@gmail.com wrote:
I know there were some changes in .90 in terms of how region balancing
occurs. Is there a resource somewhere that describes the options for the
configuration? Per Jonathan Gray's recommendation we are trying to keep
our
region count down
hbase.master.startup.retainassign=false works like a charm. After a restart
all tables are scattered across all region servers.
Thanks!
On Wed, Feb 2, 2011 at 4:06 PM, Stack st...@duboce.net wrote:
On Wed, Feb 2, 2011 at 8:41 PM, Wayne wav...@gmail.com wrote:
The regions counts are the same
After doing many tests (10k serialized scans) we see that on average opening
the scanner takes 2/3 of the read time if the read is fresh
(scannerOpenWithStop=~35ms, scannerGetList=~10ms). The second time around (1
minute later) we assume the region cache is hot and the open scanner is
much faster
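To split read latency into the open and fetch legs the way this test does, a tiny timing helper is enough; the function being timed would be whatever Thrift call (scannerOpenWithStop, scannerGetList, ...) is under measurement:

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_ms). Crude, but adequate for
    separating scanner-open latency from the subsequent fetch latency."""
    t0 = time.monotonic()
    result = fn(*args, **kwargs)
    return result, (time.monotonic() - t0) * 1000.0
```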
vs cold you are seeing below.
-ryan
On Mon, Jan 31, 2011 at 1:38 PM, Wayne wav...@gmail.com wrote:
After doing many tests (10k serialized scans) we see that on average
opening
the scanner takes 2/3 of the read time if the read is fresh
(scannerOpenWithStop=~35ms, scannerGetList=~10ms
On Mon, Jan 31, 2011 at 4:54 PM, Stack st...@duboce.net wrote:
On Mon, Jan 31, 2011 at 1:38 PM, Wayne wav...@gmail.com wrote:
After doing many tests (10k serialized scans) we see that on average
opening
the scanner takes 2/3 of the read time if the read is fresh
(scannerOpenWithStop=~35ms
in a LRU
manner, and things would get slow again.
Does this make sense to you?
On Mon, Jan 31, 2011 at 1:50 PM, Wayne wav...@gmail.com wrote:
We have heavy writes always going on so there is always memory pressure.
If the open scanner reads the first block maybe that explains the 8ms
the first
block it needs. This is done during the 'openScanner' call, and would
explain the latency you are seeing in openScanner.
-ryan
On Mon, Jan 31, 2011 at 2:17 PM, Wayne wav...@gmail.com wrote:
I assume BLOCKCACHE = 'false' would turn this off? We have turned off
cache
on all
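Disabling the block cache is a per-column-family attribute; a hedged shell example with placeholder names ('xxx'/'cf'), assuming the 0.90-era alter syntax that requires the table to be disabled first:

```ruby
disable 'xxx'
alter 'xxx', {NAME => 'cf', BLOCKCACHE => 'false'}
enable 'xxx'
```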
The current log folder in hdfs (.logs) seems to always keep to 32 log files
max per region server or the last hour. It is the .oldlogs folder that is
growing crazy large. I added the setting for hbase.master.logcleaner.ttl and
switched it from 7 days to 2 days and restarted yesterday and no
How is the .oldlogs folder cleaned up? My cluster size kept going up and I
looked closely and realized that 91% of the space was going to .oldlogs that
do not appear to be archived. This adds up to 12.5TB with rf=3 in the 4 days
we have been up with .90. How can this be configured to be cleaned
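The retention mentioned above is the hbase.master.logcleaner.ttl property in hbase-site.xml, expressed in milliseconds; the 2-day setting from this message would look like:

```xml
<!-- hbase-site.xml: maximum age of files in .oldlogs before the
     master's log cleaner may delete them; 172800000 ms = 2 days -->
<property>
  <name>hbase.master.logcleaner.ttl</name>
  <value>172800000</value>
</property>
```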
We have got .90 up and running well, but again after 24 hours of loading a
node went down. Under it all I assume it is a GC issue, but the GC logging
rolls every 60 minutes so I can never see logs from 5 hours ago (working
on getting Scribe up to solve that). Most of our issues are a node being
during balancing and splits.
Wayne, have you confirmed in your RegionServer logs that the pauses
are
associated with splits or region movement, and that you are not seeing
the
blocking store files issue?
JG
-Original Message-
From: c...@tarnas.org [mailto:c
running hbase 0.20.6, I found none.
Both use cdh3b2 hadoop.
On Thu, Jan 27, 2011 at 6:48 AM, Wayne wav...@gmail.com wrote:
We have got .90 up and running well, but again after 24 hours of loading
a
node went down. Under it all I assume it is a GC issue, but the GC
logging
rolls every 60
On Thu, Jan 27, 2011 at 6:48 AM, Wayne wav...@gmail.com wrote:
We have got .90 up and running well, but again after 24 hours of loading
a
node went down. Under it all I assume it is a GC issue, but the GC
logging
rolls every 60 minutes so I can never see logs from 5 hours ago
(working
We tried to upgrade to .90 and got 2x the nodes listed and saw none of our
old regions showing up in the counts. We assumed the upgrade was not easy
so we just re-formated the HDFS thinking it would fix everything and still
see the same problem. Any suggestions? The duplicate region servers listed
Is reverse dns a requirement with .90? It was not with .89.xxx
On Mon, Jan 24, 2011 at 3:17 PM, Wayne wav...@gmail.com wrote:
We tried to upgrade to .90 and got 2x the nodes listed and saw none of our
old regions showing up in the counts. We assumed the upgrade was not easy
so we just re
PM, Wayne wav...@gmail.com wrote:
Is reverse dns a requirement with .90? It was not with .89.xxx
On Mon, Jan 24, 2011 at 3:17 PM, Wayne wav...@gmail.com wrote:
We tried to upgrade to .90 and got 2x the nodes listed and saw none of
our
old regions showing up in the counts. We
seemed to have any
effect)...
Thanks.
On Fri, Jan 21, 2011 at 1:47 PM, Stack st...@duboce.net wrote:
On Fri, Jan 21, 2011 at 4:51 AM, Wayne wav...@gmail.com wrote:
After several hours I have figured out how to get the Disable command to
work and how to delete manually, but in the process
What is the difference between .90 and .90_master_rewrite?
Thanks.
On Fri, Jan 21, 2011 at 2:29 PM, Lars George lars.geo...@gmail.com wrote:
Hi Wayne,
0.90.0 is out. Get it while it's hot from the HBase home page.
Lars
On Jan 21, 2011, at 20:22, Wayne wav...@gmail.com wrote:
I
I need to delete some tables and I am not sure the best way to do it. The
shell does not work. The disable command says it runs ok but every time I
run drop or truncate I get an exception that says the table is not
disabled. The UI shows it as disabled but truncate/drop still do not work.
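One frequent cause of this symptom is that disable returns before every region is actually offline, so an immediate drop fails. A hedged shell sequence with a placeholder table name:

```ruby
disable 'big_table'   # may return before all regions are closed
# wait until the master/UI reports the table fully disabled, then:
drop 'big_table'
```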
I have
Not everyone is looking for a distributed memcache. Many of us are looking
for a database that scales up and out, and for that there is only one
choice. HBase does auto partitioning with regions; this is the genius of the
original bigtable design. Regions are logical units small enough to be fast
We have not found any smoking gun here. Most likely these are region splits
on a quickly growing/hot region that all clients get caught waiting for.
On Thu, Jan 13, 2011 at 7:49 AM, Wayne wav...@gmail.com wrote:
Thank you for the lead! We will definitely look closer at the OS logs.
On Thu
Thank you for the lead! We will definitely look closer at the OS logs.
On Thu, Jan 13, 2011 at 6:59 AM, Tatsuya Kawano tatsuya6...@gmail.comwrote:
Hi Wayne,
We are seeing some TCP Resets on all nodes at the same time, and
sometimes
quite a lot of them.
Have you checked this article
loads. We see RS activity drop around memstore
flushes, compactions and especially splits.
Friso
On 11 jan 2011, at 23:57, Wayne wrote:
What is shared across all nodes that could stop everything? Originally I
suspected the node with the .META. table and GC pauses but could never
find
Added: https://issues.apache.org/jira/browse/HBASE-3438.
On Wed, Jan 12, 2011 at 11:40 AM, Wayne wav...@gmail.com wrote:
We are using 0.89.20100924, r1001068
We are seeing it during heavy write load (which is all the time), but
yesterday we had read load as well as write load and saw
St.Ack
On Wed, Jan 12, 2011 at 9:03 AM, Wayne wav...@gmail.com wrote:
Added: https://issues.apache.org/jira/browse/HBASE-3438.
On Wed, Jan 12, 2011 at 11:40 AM, Wayne wav...@gmail.com wrote:
We are using 0.89.20100924, r1001068
We are seeing it during heavy write load (which
haven't seen the same
problems after down rev'ing to jdk1.6u16.
-brent
On Mon, Jan 10, 2011 at 12:59 PM, Wayne wav...@gmail.com wrote:
We had a node last night go awol and got stuck in permanent 50% CPU wait
time. The node also steadily shot up the load to 400 before we saw it and
had
this also with evil disk controllers on the edge of dying.
On Tue, Jan 11, 2011 at 12:10 PM, Wayne wav...@gmail.com wrote:
Thanks a lot for the heads up on this. We have only seen this once, but
if
we start seeing it more we will definitely try to go back to a previous
version. We
We have very frequent cluster wide pauses that stop all reads and writes
for seconds. We are constantly loading data to this cluster of 10 nodes.
These pauses can happen as frequently as every minute but sometimes are not
seen for 15+ minutes. Basically watching the Region server list with
We are seeing a lot of warnings and errors in the HDFS logs (examples
below). I am looking for any help or recommendations anyone can provide. It
almost looks like compaction/splits occur and errors occur while looking for
the old data. Could this be true? Are these warnings and errors normal?
Thank you for the help. Below are a few more questions.
On Mon, Jan 10, 2011 at 1:41 PM, Jean-Daniel Cryans jdcry...@apache.orgwrote:
Inline.
*Region/Meta Cache
*
Often times the region list is not hot and thrift has to talk to the
meta
table. We have 6k+ regions and growing quickly
you build a history. Is hbase trying
to use files its already deleted elsewhere?
St.Ack
On Mon, Jan 10, 2011 at 9:20 AM, Wayne wav...@gmail.com wrote:
We are seeing a lot of warnings and errors in the HDFS logs (examples
below). I am looking for any help or recommendations anyone can
We had a node last night go awol and got stuck in permanent 50% CPU wait
time. The node also steadily shot up the load to 400 before we saw it and
had to hard reboot. Besides that all other ganglia metrics flat-lined. Is
this some sort of bizarre kernel problem? We are using xfs with standard
settings.
of hbase are you on?
Thanks Wayne,
St.Ack
On Mon, Jan 10, 2011 at 11:34 AM, Wayne wav...@gmail.com wrote:
Yes it appears blocks are being referenced that have been deleted hours
before. The logs below are based on tracing that history.
On Mon, Jan 10, 2011 at 2:31 PM, Stack st
I am trying to understand exactly what an HBase read is doing through Thrift
(python) so that we can know what to change to improve our performance (read
latency). We have turned off all cache to make testing consistent.
*Region/Meta Cache
*
Often times the region list is not hot and thrift has
On Fri, Jan 7, 2011 at 1:09 AM, Stack st...@duboce.net wrote:
On Thu, Jan 6, 2011 at 5:46 AM, Wayne wav...@gmail.com wrote:
I had another node go down last night. No load at the time, it just seems
it
had issues and shut itself down. Any help would be greatly appreciated.
Why
would
I see the message below as often as every few minutes. It appears to occur
after compaction begins. Is this normal? Is it an indication of bigger
issues? This is after having upped our xceivers.
WARN org.apache.hadoop.hbase.regionserver.Store: Not in
I had another node go down last night. No load at the time, it just seems it
had issues and shut itself down. Any help would be greatly appreciated. Why
would the file system go away? Is this an hadoop problem or a hardware
problem or ??
Here is a sampling of the logs. Please let me know what
, 2011 at 12:13 PM, Wayne wav...@gmail.com wrote:
It was carrying ~9k writes/sec and has been for the last 24+ hours. There
are 500+ regions on that node. I could not find the heap dump (location?)
but we do have some errant big rows that have crashed before. When we query
those big rows thrift
of the hprof is usually where the program was launched from
(check $HBASE_HOME dir).
St.Ack
On Wed, Jan 5, 2011 at 11:24 AM, Wayne wav...@gmail.com wrote:
Pretty sure this is compaction. The same node OOME again along with
another
node after starting compaction. Like cass* .6 I guess hbase can
Having worked with the other java/thrift based nosql solution we have been
using Thrift Accelerated Protocol and it works great. It is very fast and we
have seen 3-4x performance improvement on some read operations (wide rows).
We have never seen this advertised or referenced with any hbase
After heavily loading a 10 node cluster for 3-4 days I got a concurrent mode
failure of 53 seconds followed by a NodeIsDeadException which caused the
node to be shut down. Is there is a timeout that can be increased so this
does not occur in the future? From my experience with Cassandra Concurrent
, Wayne wav...@gmail.com wrote:
Any help or suggestions would be appreciated. Parnew was getting large
and
taking too long (> 100ms) so I will try to limit the size with the
suggestion from the performance tuning page (-XX:NewSize=6m
-XX:MaxNewSize=6m).
The CMS concurrent mode failure
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:NewRatio=3
-XX:MaxTenuringThreshold=1
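Collected in one place, the JVM flags quoted across this thread would sit in hbase-env.sh roughly as below. The -XX:+UseConcMarkSweepGC flag is an assumption (the CMS-specific settings imply it); this is a sketch of what was being run, not an endorsed tuning:

```sh
# hbase-env.sh (sketch; flags gathered from this thread)
export HBASE_OPTS="$HBASE_OPTS \
  -Xmx8g \
  -XX:+UseConcMarkSweepGC \
  -XX:CMSInitiatingOccupancyFraction=65 \
  -XX:+CMSParallelRemarkEnabled \
  -XX:SurvivorRatio=8 \
  -XX:NewRatio=3 \
  -XX:MaxTenuringThreshold=1 \
  -XX:NewSize=6m -XX:MaxNewSize=6m"
```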
On Mon, Jan 3, 2011 at 5:05 PM, Stack st...@duboce.net wrote:
On Mon, Jan 3, 2011 at 12:50 PM, Wayne wav...@gmail.com wrote:
We have an 8GB heap. What should newsize be? I
We are finding that the node that is responsible for the .META. table is
going in GC storms causing the entire cluster to go AWOL until it recovers.
Isn't the master supposed to serve up the .META. table? Is it possible to
pin this table somewhere that only handles this? Our master server and
on).
BTW, are you using CMS on an 8GB heap JVM and experiencing a 4 minute
pause? That sounds like a lot.
On Thu, Dec 30, 2010 at 1:51 PM, Wayne wav...@gmail.com wrote:
Lesson learned...restart thrift servers *after* restarting hadoop+hbase.
On Thu, Dec 30, 2010 at 3:39 PM, Wayne wav
memstore.flush.size = 268435456 (256MB = 4x default)
hregion.max.filesize = 1073741824 (1GB = 4x default)
*Table*
alter 'xxx', METHOD => 'table_att', DEFERRED_LOG_FLUSH => true
On Wed, Dec 29, 2010 at 12:55 AM, Stack st...@duboce.net wrote:
On Mon, Dec 27, 2010 at 11:47 AM, Wayne wav...@gmail.com
and configuration.
How many concurrent writer processes are you running?
Thanks,
Michael
On Thu, Dec 30, 2010 at 8:51 AM, Wayne wav...@gmail.com wrote:
We finally got our cluster up and running and write performance looks
very
good. We are getting sustained 8-10k writes/sec/node on a 10 node
Lesson learned...restart thrift servers *after* restarting hadoop+hbase.
On Thu, Dec 30, 2010 at 3:39 PM, Wayne wav...@gmail.com wrote:
We have restarted with lzop compression, and now I am seeing some really
long and frequent stop the world pauses of the entire cluster. The load
requests
, Dec 27, 2010 at 1:49 PM, Stack st...@duboce.net wrote:
On Fri, Dec 24, 2010 at 5:09 AM, Wayne wav...@gmail.com wrote:
We are in the process of evaluating hbase in an effort to switch from a
different nosql solution. Performance is of course an important part of
our
evaluation. We are a python
On Fri, Dec 24, 2010 at 5:09 AM, Wayne wav...@gmail.com wrote:
We are in the process of evaluating hbase in an effort to switch from a
different nosql solution. Performance is of course an important part of
our
evaluation. We are a python shop and we are very worried that we can not
get
any
will use HDFS
pread instead of seek/read. For this application, you absolutely must be
using pread.
Good luck. I'm interested in seeing how you can get HBase to perform, we
are here to help if you have any issues.
JG
-Original Message-
From: Wayne [mailto:wav...@gmail.com]
Sent
much data is really too much on an hbase data node?
Any help or advice would be greatly appreciated.
Thanks
Wayne
PM, Jean-Daniel Cryans jdcry...@apache.orgwrote:
Hi Wayne,
This question has such a large scope but is applicable to such a tiny
subset of workloads (eg yours) that fielding all the questions in
details would probably end up just wasting everyone's cycles. So first
I'd like to clear up some
amount of activity in this area (optimizing HDFS for
random reads) and lots of good ideas. HDFS-347 would probably help
tremendously for this kind of high random read rate, bypassing the DN
completely.
JG
-Original Message-
From: Wayne [mailto:wav...@gmail.com]
Sent: Friday
into
the 100s of tables we might very well set up totally different clusters to
handle different groups of customers.
Thanks.
On Thu, Jul 15, 2010 at 11:47 PM, Gary Helmling ghelml...@gmail.com wrote:
On Wed, Jul 14, 2010 at 1:25 AM, Wayne wav...@gmail.com wrote:
1) How can hbase be configured