Hi Harsh J,
I'm not using the WAL in my writes.
Is there still log rolling?
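(For illustration: skipping the WAL per Put looks roughly like this in the
0.9x Java client. Untested sketch; the table and column names are made up.)

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class NoWalPut {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable"); // hypothetical table
        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
        // Skip the write-ahead log for this mutation. Faster, but the edit
        // is lost if the region server dies before the memstore flushes.
        put.setWriteToWAL(false);
        table.put(put);
        table.close();
      }
    }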
On Jun 17, 2012, at 7:40, Harsh J wrote:
Amit,
Your values for the HLog block size (hbase.regionserver.hlog.blocksize;
it defaults to the HDFS block size, 64 MB unless you've raised it, which
is too low unless you also have HLog compression) and for the maximum
number of HLogs to keep (hbase.regionserver.maxlogs, default 32 files)
can easily ca…
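(For reference, the two knobs named above; an illustrative sketch only. In
practice these go into hbase-site.xml on the region servers, and the values
shown are just the defaults cited in this thread.)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class HlogSettings {
      public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // HLog block size; defaults to the HDFS block size (64 MB unless raised).
        conf.setLong("hbase.regionserver.hlog.blocksize", 64L * 1024 * 1024);
        // HLog files to accumulate before flushes are forced (default 32).
        conf.setInt("hbase.regionserver.maxlogs", 32);
        System.out.println(conf.get("hbase.regionserver.maxlogs"));
      }
    }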
Just to add from my experiences:
Yes, hotspotting is bad, but so are devops headaches. A reasonable machine
can handle 3,000-4,000 puts a second with ease, and a simple timerange scan
can give you the records you need. I have my doubts you will be hitting
these amounts anytime soon. A simple setup will…
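(To illustrate the "simple timerange scan": an untested sketch against the
0.9x client API. The table name, the 30-day cutoff, and the one-version
assumption are all made up for the example.)

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;

    public class TimeRangeScan {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "files"); // hypothetical table
        long cutoff = System.currentTimeMillis() - 30L * 24 * 60 * 60 * 1000;
        Scan scan = new Scan();
        // Only return cells last written before the cutoff. With one version
        // per cell this yields the rows not touched in the last 30 days.
        scan.setTimeRange(0, cutoff);
        ResultScanner scanner = table.getScanner(scan);
        for (Result r : scanner) {
          System.out.println(r);
        }
        scanner.close();
        table.close();
      }
    }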
Jean-Marc,
You indicated that you didn't want to do full table scans when you want to
find out which files haven't been touched since some time X has passed.
(X could be months, weeks, days, hours, etc.)
So here's the thing.
First, I am not convinced that you will have hotspotting.
Second, you e…
Let's imagine the timestamp is "123456789".
If I salt it with a letter from 'a' to 'z', then it will always be split
across a few RegionServers. I will have something like "t123456789". The
issue is that I will have to do 26 queries to be able to find all the
entries. I will need to query from A0 to Axxx…
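(Concretely, the 26-query fan-out would look something like this; an
untested sketch where the second table's name and the exact key layout are
assumptions, not something specified in the thread.)

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SaltedTimeScan {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "index2"); // hypothetical 2nd table
        String cutoff = "123456789"; // timestamp from the example above
        for (char salt = 'a'; salt <= 'z'; salt++) {
          // One range scan per salt prefix: ["x0", "x123456789").
          Scan scan = new Scan(Bytes.toBytes(salt + "0"),
                               Bytes.toBytes(salt + cutoff));
          ResultScanner scanner = table.getScanner(scan);
          for (Result r : scanner) {
            System.out.println(r); // entries older than the cutoff
          }
          scanner.close();
        }
        table.close();
      }
    }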
You can't salt the key in the second table.
By salting the key, you lose the ability to do range scans, which is what you
want to do.
Sent from a remote device. Please excuse any typos...
Mike Segel
On Jun 16, 2012, at 6:22 AM, Jean-Marc Spaggiari wrote:
> Thanks all for your comments and…
Thanks Doug, I read the regions section from the book as you recommended,
but I still have some questions left.
When running a massive write job, the region server log shows the memsize
that is flushed. The problem is that most of the time the memsize is either
much smaller than the memstore.flush.size…
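(The threshold being compared against here is presumably
hbase.hregion.memstore.flush.size; note that flushes can legitimately come
in under it when something else forces them first, e.g. hitting
hbase.regionserver.maxlogs as discussed above, or global memstore pressure.
A sketch of the key; the 128 MB value is illustrative, not a recommendation.)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class FlushSettings {
      public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // Per-region memstore size at which a flush is normally triggered.
        conf.setLong("hbase.hregion.memstore.flush.size", 128L * 1024 * 1024);
        System.out.println(conf.getLong("hbase.hregion.memstore.flush.size", 0));
      }
    }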
Hi,
I have done this at a customer site to overcome the slow WAL performance
of 0.90.x. With one RS per DN we bottlenecked; with 5-7 RS per DN we were
able to hit the target rate.
Please note that we did this in lieu of the proper built-in options like
WAL compression, multiple WALs, or n-way wri…
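(For completeness: the built-in WAL compression mentioned here is a single
switch once you are on 0.94+ (HBASE-4608). Sketch:)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class WalCompression {
      public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // Enable HLog (WAL) compression; available from HBase 0.94.
        conf.setBoolean("hbase.regionserver.wal.enablecompression", true);
        System.out.println(
            conf.getBoolean("hbase.regionserver.wal.enablecompression", false));
      }
    }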
Thanks all for your comments and suggestions. Regarding the
hotspotting, I will try to salt the key in the 2nd table and see the
results.
Yesterday I finished installing my 4-server cluster of old machines.
It's slow, but it's working. So I will do some testing.
You are recommending to modify th…
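(On the write side, salting the 2nd table's key might look like this; an
untested sketch where the salt-by-hash scheme, table name, and column names
are assumptions for illustration.)

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SaltedPut {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "index2"); // hypothetical 2nd table
        String file = "some/file/path";
        long ts = System.currentTimeMillis();
        // Derive a stable one-letter salt so writes spread over 26 key ranges;
        // readers then scan each prefix 'a'..'z' as shown earlier.
        char salt = (char) ('a' + (file.hashCode() & 0x7fffffff) % 26);
        Put put = new Put(Bytes.toBytes("" + salt + ts));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("file"), Bytes.toBytes(file));
        table.put(put);
        table.close();
      }
    }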
Hi Doug,
You're right. I missed it :( I received Lars' book yesterday, so I
will read a lot more before my next question ;)
JM
2012/6/13, Doug Meil:
>
> Just wanted to point out that this is also discussed under the autoFlush
> entry in this chapter:
>
> http://hbase.apache.org/book.html#perf.writi…
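(The autoFlush knob that entry covers is set on the client-side HTable;
roughly like this in the 0.9x API. Untested sketch; table and column names
are invented.)

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BufferedWrites {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable"); // hypothetical table
        // Buffer puts client-side instead of sending one RPC per put.
        table.setAutoFlush(false);
        for (int i = 0; i < 1000; i++) {
          Put put = new Put(Bytes.toBytes("row-" + i));
          put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(i));
          table.put(put); // queued in the local write buffer
        }
        table.flushCommits(); // drain the buffer in batched RPCs
        table.close();
      }
    }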
Stack,
I have no issues with HBase, the question is purely theoretical.
> So, you intend doubling the datanode instances per machine too?
Anything else would not make sense to me; what do you think?
Thanks for your feedback!
Regards,
Em
On 16.06.2012 at 07:12, Stack wrote:
> On Fri, Jun 15…