Re: [ANNOUNCE] Apache HBase 1.3.1 is now available for download

2017-04-21 Thread lars hofhansl
Hi Mikhail, I don't see a 1.3.1 tag, yet. Thanks. -- Lars From: Mikhail Antonov To: "user@hbase.apache.org" ; "d...@hbase.apache.org" Sent: Friday, April 21, 2017 1:02 AM Subject: [ANNOUNCE] Apache HBase 1.3.1 is

Re: Regionservers going down during compaction

2015-07-17 Thread lars hofhansl
Yep. Agreed. Local NUMA zones are usually not what you want for something like HBase. -- Lars From: Vladimir Rodionov vladrodio...@gmail.com To: user@hbase.apache.org user@hbase.apache.org; lars hofhansl la...@apache.org Sent: Thursday, July 16, 2015 12:01 PM Subject: Re: Regionservers

Re: Regionservers going down during compaction

2015-07-15 Thread lars hofhansl
We're running fine with a 31g heap (31 to be able to make use of compressed oops) after a lot of tuning. Maybe your pattern is different...? Or... Since it is ParNew on a small gen of only 1GB taking that much time... Maybe you ran into this: http://www.evanjones.ca/jvm-mmap-pause.html? -- Lars

Re: [DISCUSS] correcting abusive behavior on mailing lists was (Re: [DISCUSS] Multi-Cluster HBase Client)

2015-06-30 Thread lars hofhansl
Moderating is better than outright banning, I think. While Micheal is sometimes infuriating, he's also funny and smart. Can we have a group of moderators? I'd volunteer, but not if I'm the only one. -- Lars From: Stack st...@duboce.net To: Hbase-User user@hbase.apache.org Cc:

Re: [DISCUSS] correcting abusive behavior on mailing lists was (Re: [DISCUSS] Multi-Cluster HBase Client)

2015-06-30 Thread lars hofhansl
the interaction with Sean. Are you going to defend that as within the bounds of acceptable behavior? Is this the first time that has happened? On Jun 30, 2015, at 5:58 PM, lars hofhansl la...@apache.org wrote: Moderating is better than outright banning, I think. While Micheal is sometimes

Re: How to make the client fast fail

2015-06-16 Thread lars hofhansl
Filed https://issues.apache.org/jira/browse/HBASE-13919, please chime in there. Thanks. -- Lars  From: lars hofhansl la...@apache.org To: mukund murrali mukundmurra...@gmail.com; user@hbase.apache.org user@hbase.apache.org Sent: Tuesday, June 16, 2015 12:37 PM Subject: Re: How to make

Re: How to make the client fast fail

2015-06-16 Thread lars hofhansl
there. Thanks for pointing this issue out. -- Lars From: mukund murrali mukundmurra...@gmail.com To: user@hbase.apache.org; lars hofhansl la...@apache.org Sent: Tuesday, June 16, 2015 12:21 AM Subject: Re: How to make the client fast fail We are using HBase - 1.0.0. Yes we have gone through

Re: How to make the client fast fail

2015-06-16 Thread lars hofhansl
Please always tell us which version of HBase you are using. We have fixed a lot of issues in this area over time. Here's an _old_ blog post I wrote about this: http://hadoop-hbase.blogspot.com/2012/09/hbase-client-timeouts.html Using yet more threads to monitor timeouts of another thread is a

Re: Hbase vs Cassandra

2015-06-06 Thread lars hofhansl
HBase is a distributed, consistent, sorted key value store. The sorted bit allows for range scans in addition to the point gets that all K/V stores support. Nothing more, nothing less. It happens to store its data in HDFS by default, and we provide convenient input and output formats for map

Re: Hbase vs Cassandra

2015-05-31 Thread lars hofhansl
You really have to try out both if you want to be sure. The fundamental differences that come to mind are: * HBase is always consistent. Machine outages lead to inability to read or write data on that machine. With Cassandra you can always write. * Cassandra defaults to a random partitioner, so

Re: Optimizing compactions on super-low-cost HW

2015-05-24 Thread lars hofhansl
Yeah, all you can do is drive your write amplification down. As Stack said: - Increase hbase.hstore.compactionThreshold, and hbase.hstore.blockingStoreFiles. It'll hurt read, but in your case read is already significantly hurt when compactions happen. - Absolutely set

Re: hbase 0.94.7 snapshot problem

2015-05-17 Thread lars hofhansl
The latest version of 0.94 is 0.94.27. I doubt you'll get much help for 0.94.7 here (it's two years and 20! releases ago) Note that you can upgrade from 0.94.7 to 0.94.27 without down time (with a rolling upgrade), but you'll have to build it from source yourself. -- Lars From: Neutron

Re: Test failure for HBase 0.94.18 against hadoop 2.2.0

2015-05-08 Thread lars hofhansl
Hi Kuldeep, could you file a jira with the error you see? Some folks are still using 0.94, so we should see whether we can fix it. -- Lars - Original Message - From: Kuldeep Bora kuldeep.b...@gmail.com To: user@hbase.apache.org Cc: Sent: Thursday, May 7, 2015 7:59 AM Subject: Re:

Re: HBase memstore flush configurations

2015-05-08 Thread lars hofhansl
Do you have some log messages from the flushes? Maybe you're flushing due to hbase.regionserver.flush.per.changes (number of edits before we flush). We also flush the memstore when the WAL is rolled, which also happens once/hour. Note hbase.regionserver.optionalcacheflushinterval mostly exists

Re: Right value for hbase.rpc.timeout

2015-05-04 Thread lars hofhansl
Why do you have regions that large? The 0.92 default was 1G (admittedly, that was much too small), the 0.98 default is 10G, which should be good in most cases. Mappers divide their work based on regions, so very large regions lead to more uneven execution time, unless you truly have a very

Re: RegionServer not crash but not response!!

2015-05-04 Thread lars hofhansl
Do you have a stack dump? You can get that with: jstack pid -- Lars From: louis.hust louis.h...@gmail.com To: user@hbase.apache.org Cc: Ted Yu yuzhih...@gmail.com Sent: Monday, May 4, 2015 6:49 PM Subject: Re: RegionServer not crash but not response!! Can anyone help me? On May 4, 2015,

Re: Rowkey design question

2015-04-12 Thread lars hofhansl
After the fun interlude (sorry about that) let me get back to the issue. There are multiple considerations: 1. row vs column. If in doubt err on the side of more rows. Only use many columns in a row when you need a transaction over the data in the columns. 2. Value sizes. HBase is good at dealing

Re: introducing nodes w/ more storage

2015-04-02 Thread lars hofhansl
of those. -- Lars From: Kevin O'dell kevin.od...@cloudera.com To: user@hbase.apache.org user@hbase.apache.org Cc: lars hofhansl la...@apache.org Sent: Thursday, April 2, 2015 5:41 AM Subject: Re: introducing nodes w/ more storage Hi Mike,   Sorry for the delay here.   How does the HDFS

Re: [ANNOUNCE] Sean Busbey joins the Apache HBase PMC

2015-03-26 Thread lars hofhansl
Yeah! :) From: Andrew Purtell apurt...@apache.org To: user@hbase.apache.org user@hbase.apache.org; d...@hbase.apache.org d...@hbase.apache.org Sent: Thursday, March 26, 2015 10:26 AM Subject: [ANNOUNCE] Sean Busbey joins the Apache HBase PMC On behalf of the Apache HBase PMC I'm

Re: master consumes large amount of CPU for days

2015-03-26 Thread lars hofhansl
Hi Ted, Yes, it is safe to bounce the HMaster without taking the region servers down. Are there any regions in transition (would be shown on the master's page)? All of the threads involved with - org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(org.apache.hadoop.fs.Path) @bci=14,

Re: Hbase PrefixFilter Problem

2015-03-25 Thread lars hofhansl
Do you have a lot of data? This means the scanner took too long on the server and the client essentially timed out. PrefixFilter does not actually skip ahead to the first row in question. You always have to also set the scanner's startRow to the first row you care about (i.e. the prefix). In
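The row-range arithmetic behind this advice can be sketched in a few lines. This is an illustration in Python, not the HBase Java API: the helper name `prefix_stop_row` and the sample prefix are hypothetical, and the sketch assumes row keys sort as unsigned bytes in lexicographic order, which is how HBase orders rows. The scan starts at the prefix itself; the optional stop row is the smallest key that no longer carries the prefix.

```python
def prefix_stop_row(prefix: bytes) -> bytes:
    """Return the smallest row key that sorts after every key starting with prefix.

    An empty result means "no upper bound" (the prefix is empty or all 0xFF bytes).
    """
    # Trailing 0xFF bytes cannot be incremented, so drop them first.
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        return b""
    # Increment the last remaining byte to get the exclusive upper bound.
    return trimmed[:-1] + bytes([trimmed[-1] + 1])


prefix = b"user123|"                 # hypothetical row-key prefix
start_row = prefix                   # where the scan should begin
stop_row = prefix_stop_row(prefix)   # where the scan can safely end
```

With both bounds set, the server only touches rows in the prefix range instead of filtering its way through everything that sorts before the prefix.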

[ANNOUNCE] HBase 0.94.27 is available for download

2015-03-25 Thread lars hofhansl
The HBase Team is pleased to announce the immediate release of HBase 0.94.27. Download it from your favorite Apache mirror [1]. This release has also been pushed to Apache's maven repository. All previous 0.92.x and 0.94.x releases can be upgraded to 0.94.27 via a rolling upgrade without

Re: Root Region Error with hbase 0.98.4

2015-03-22 Thread lars hofhansl
Did you wipe everything before you ran your code against 0.94? 0.96 and later no longer have the -ROOT- region. If you start from scratch with 0.94 it will create the necessary ZK state for you (including the -ROOT- node). You need to wipe _everything_ HDFS, and the Zookeeper state (rmr the /hbase

Re: introducing nodes w/ more storage

2015-03-22 Thread lars hofhansl
Seems that it should not be too hard to add that to the stochastic load balancer. We could add a spaceCost or something. - Original Message - From: Jean-Marc Spaggiari jean-m...@spaggiari.org To: user user@hbase.apache.org Cc: Development developm...@mentacapital.com Sent: Thursday,

Re: Poll: HBase usage by HBase version

2015-03-18 Thread lars hofhansl
Realistically this should be weighted by number of machines. If you run a small 5 node cluster, sure, you can upgrade easily. But your vote does not count as much as somebody who's running 1000 machines. -- Lars From: Otis Gospodnetic otis.gospodne...@gmail.com To: user@hbase.apache.org

Re: Re: Why can the capacity of a table with TTL grow continuously?

2015-03-10 Thread lars hofhansl
I agree. This looks as it should. You also mentioned that you have compactions enabled. If you force a major_compact through the HBase shell, will some space be reclaimed? (careful, this will compact everything in the table, which can put some load on the net/disks). Lastly, did you stop

Re: RegionServer - Insufficient Memory and Cascading Errors

2015-03-09 Thread lars hofhansl
Sub 1GB heaps are not useful for anything but cursory functional testing with a few rows. It does not give the GC enough leeway to deal with the per RPC garbage that HBase produces. But this does not translate to similar behavior with more data and larger heaps. If I can plug my own blog post

Re: [ANNOUNCE] Apache HBase 1.0.0 is now available for download

2015-02-24 Thread lars hofhansl
Hip hip, Hooray!! From: Enis Söztutar e...@apache.org To: hbase-user user@hbase.apache.org; d...@hbase.apache.org d...@hbase.apache.org Sent: Tuesday, February 24, 2015 12:30 AM Subject: [ANNOUNCE] Apache HBase 1.0.0 is now available for download The HBase Team is pleased to

Re: Streaming data to htable

2015-02-14 Thread lars hofhansl
That's pretty cool. Have you documented somewhere how exactly you do that (a blog post or something)? That'd be useful for other folks to know. From: Geovanie Marquez geovanie.marq...@gmail.com To: user@hbase.apache.org user@hbase.apache.org Sent: Friday, February 13, 2015 12:14 PM

Re: Determining regions with low HDFS locality index

2014-12-26 Thread lars hofhansl
There should be logic that attempts to restore the regions on the region servers that had them last. Note that the master can only assign regions to region servers that have reported in. For that reason the master waits a bit (4.5s by default) for region servers to report in after a master start

0.94 going forward

2014-12-15 Thread lars hofhansl
Over the past few months the rate of the change into 0.94 has slowed significantly. 0.94.25 was released on Nov 15th, and since then we had only 4 changes. This could mean two things: (1) 0.94 is very stable now or (2) nobody is using it (at least nobody is contributing to it anymore). If

Re: HConnectionManager leaks with zookeeper conection oo many connections from /my.tomcat.server.com - max is 60

2014-12-15 Thread lars hofhansl
Excellent! Should be quite a bit faster too. -- Lars From: Serega Sheypak serega.shey...@gmail.com To: user user@hbase.apache.org Cc: lars hofhansl la...@apache.org Sent: Monday, December 15, 2014 5:57 AM Subject: Re: HConnectionManager leaks with zookeeper conection oo many

Re: 0.94 going forward

2014-12-15 Thread lars hofhansl
! On Mon, Dec 15, 2014 at 1:53 PM, lars hofhansl la...@apache.org wrote: Over the past few months the rate of the change into 0.94 has slowed significantly. 0.94.25 was released on Nov 15th, and since then we had only 4 changes. This could mean two

Re: 0.94 going forward

2014-12-15 Thread lars hofhansl
0.94 i dont believe nobody is using it. for our clients the majority is on 0.94 (versus 0.96 and up). so i am going with 1), its very stable! On Mon, Dec 15, 2014 at 1:53 PM, lars hofhansl la...@apache.org wrote: Over the past few months the rate of the change

Re: HConnectionManager leaks with zookeeper conection oo many connections from /my.tomcat.server.com - max is 60

2014-12-13 Thread lars hofhansl
Note also that the createConnection part is somewhat expensive (creates a new thread pool for use with Puts, also does a ZK lookup, etc). If possible create the connection ahead of time and only get/close an HTable per request/thread. -- Lars From: Serega Sheypak serega.shey...@gmail.com

Re: What companies are using HBase to serve a customer-facing product?

2014-12-06 Thread lars hofhansl
For expected latency, read this: http://hadoop-hbase.blogspot.com/2014/08/hbase-client-response-times.html For cluster/machine sizing this might be helpful: http://hadoop-hbase.blogspot.com/2013/01/hbase-region-server-memory-sizing.html Disclaimer: I wrote these two posts. -- Lars From:

Re: client timeout

2014-12-04 Thread lars hofhansl
Bad disk or network? Anything in the logs (HBase, HDFS, and System logs)? HBase 0.94, still? The easiest way is to just kill the region servers, the others will pick up the regions. -- Lars From: Ted Tuttle t...@mentacapital.com To: user@hbase.apache.org user@hbase.apache.org Cc: Development

Re: Is there anyway I can list out history of transitions made by a region?

2014-11-30 Thread lars hofhansl
You can try Hannibal: https://github.com/sentric/hannibal From: gomes sankarm...@gmail.com To: user@hbase.apache.org Sent: Sunday, November 30, 2014 7:35 PM Subject: Is there anyway I can list out history of transitions made by a region? I want to figure out

Re: Can't enable table after rebooting hbase

2014-11-24 Thread lars hofhansl
Please tell us what happened. Exceptions, error messages, anything in the logs, anything on the overview pages, etc, etc? -- Lars From: guxiaobo1982 guxiaobo1...@qq.com To: user user@hbase.apache.org Sent: Monday, November 24, 2014 7:15 PM Subject: Can't enable table after rebooting

Re: [ANNOUNCE] HBase 0.94.25 is available for download

2014-11-20 Thread lars hofhansl
And now that 0.98.8 is out... 0.94.25 also contains the fix for HBASE-12536. https://issues.apache.org/jira/browse/HBASE-12536 Thanks. -- Lars From: lars hofhansl la...@apache.org To: Dev d...@hbase.apache.org; User user@hbase.apache.org Sent: Saturday, November 15, 2014 4:52 PM Subject

Re: how to explain read/write performance change after modifying the hfile.block.cache.size?

2014-11-20 Thread lars hofhansl
That would explain it if memstores are flushed due to global memory pressure. But cache and memstore size are (unfortunately) configured independently. The memstore heap portion would be 40% (by default) in either case. So this is a bit curious still. Ming, can you tell us more details? - RAM on

[ANNOUNCE] HBase 0.94.25 is available for download

2014-11-15 Thread lars hofhansl
The HBase Team is pleased to announce the immediate release of HBase 0.94.25. Download it from your favorite Apache mirror [1]. This release has also been pushed to Apache's maven repository. All previous 0.92.x and 0.94.x releases can be upgraded to 0.94.25 via a rolling upgrade without

Re: Region split during mapreduce

2014-11-01 Thread lars hofhansl
I do not believe that to be true. HBase only uses Region boundaries to identify useful scan ranges during the setup of the job. These ranges will work regardless of whether the number of regions increases later or not. The worst case is that a single mapper might be scanning multiple regions

Re: Upgrading a coprocessor

2014-10-28 Thread lars hofhansl
A rolling restart across all region servers would reload all the latest classes. That is currently the best option, IMHO. -- Lars From: Mike Axiak m...@axiak.net To: user@hbase.apache.org Sent: Tuesday, October 28, 2014 7:03 PM Subject: Re: Upgrading a coprocessor Because of this

Re: Hbase unstable, master and regionservers crashes every night at ~ 00:07 UTC

2014-10-28 Thread lars hofhansl
We had an issue similar to this for a while. Every Sunday between 1am and 6am we'd experience ZK timeouts. Turned out it was the weekly RAID check across all our boxes that caused ZK to time out. -- Lars From: Ron van der Vegt ron.van.der.v...@openindex.io To: user@hbase.apache.org Sent:

Re: HBase read performance

2014-10-13 Thread lars hofhansl
the period of reverse scan would help. Cheers On Thu, Oct 2, 2014 at 4:52 PM, lars hofhansl la...@apache.org wrote: You might have the data in the OS buffer

Re: hbase row locking

2014-10-11 Thread lars hofhansl
What are you trying to prove/observe, Abhishek? HBase does not hold the row lock while the WAL is sync'ed (see my previous response), so network settings would have no bearing on how long the row locks are being held. See HBASE-4528. You'd have to go back to an 0.92 release to still observe that

Re: Clarifications on HBase Durability

2014-10-09 Thread lars hofhansl
Correct on all points. I really need to pick up HBASE-5954 again. What prevented me before was that (a) there was little interest (even from myself in the end) and (b) with the variety of Hadoop versions we supported in 0.94 this was reflection hell. In HBase 1.x or 2.x we will likely only

[ANNOUNCE] HBase 0.94.24 is available for download

2014-10-04 Thread lars hofhansl
The HBase Team is pleased to announce the immediate release of HBase 0.94.24. Download it from your favorite Apache mirror [1]. This release has also been pushed to Apache's maven repository. All previous 0.92.x and 0.94.x releases can be upgraded to 0.94.24 via a rolling upgrade without

Re: single column value filter to find rows greater than a certain date not working

2014-10-03 Thread lars hofhansl
Bytes.toBytes(20110810) Is that exactly how you are storing the dates? As string converted to bytes? Or did you store them as a long converted to bytes? Also note that this is a fairly inefficient way of doing this. If this is the typical access pattern you should put the data in the row key
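Why the storage encoding matters can be shown with a small sketch. It is written in Python and only mimics the byte layouts: it assumes that HBase's `Bytes.toBytes` produces UTF-8 bytes for a String, 4 big-endian bytes for an int, and 8 big-endian bytes for a long, and that the filter compares values byte by byte. The variable names are illustrative.

```python
import struct

date = 20110810

# Three plausible ways the same date could have been written to the cell.
as_string = str(date).encode("utf-8")   # like Bytes.toBytes("20110810"): 8 ASCII bytes
as_int = struct.pack(">i", date)        # like Bytes.toBytes(20110810): 4 bytes, big-endian
as_long = struct.pack(">q", date)       # like Bytes.toBytes(20110810L): 8 bytes, big-endian

# A byte-wise "greater than" check is only meaningful when the stored value
# and the comparator use the same encoding.
earlier_as_string = b"20100101"
matches_by_accident = earlier_as_string > as_long   # '2' (0x32) sorts after 0x00
```

Here an earlier date stored as a string still compares as greater than the long-encoded comparator, simply because its first byte is an ASCII digit. A filter built with one encoding against data stored in another can therefore match everything or nothing.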

Re: HBase read performance

2014-10-02 Thread lars hofhansl
Also you may be maxing out the network between the region server and your client - depending on the size of the KVs. What's the bandwidth you see used on both ends? Are you running clients on multiple machines? - Original Message - From: lars hofhansl la...@apache.org To: user

Re: HBase read performance

2014-10-02 Thread lars hofhansl
OK... We might need to investigate this. Any chance that you can provide a minimal test program and instructions about how to set it up? We can do some profiling then. One thing to note is that with scanning HBase cannot use bloom filters to rule out HFiles ahead of time, it needs to look into

Re: hbase row locking

2014-09-28 Thread lars hofhansl
Is anyone aware of any company who does not use the hdfs default policy and flush every WAL sync? It's a trade-off. You'll only lose data when the wrong three machines die around the same time (you'd have an outage for any block that exists only on these three boxes). Note also that time

Re: hbase row locking

2014-09-27 Thread lars hofhansl
You didn't tell us which version of HBase. HBase is pretty smart about how long it needs to hold locks. For example the flush to the WAL is done without the row lock held. The row lock is only held to create the WAL edit and to add the edit to memstore, then it is released. After that we sync

Re: Is there any way to truncate the hbase table but keep all the regions

2014-09-27 Thread lars hofhansl
If you have a patch I am happy to commit it to 0.94 (if it's not too invasive). -- Lars From: Weichen YE yeweichen2...@gmail.com To: user@hbase.apache.org user@hbase.apache.org Sent: Saturday, September 27, 2014 7:13 PM Subject: Re: Is there any way to

Re: Configuring tombstone purge independent of deleted cell purge

2014-09-23 Thread lars hofhansl
...@gmail.com To: user@hbase.apache.org; lars hofhansl la...@apache.org Sent: Tuesday, September 23, 2014 9:10 AM Subject: Re: Configuring tombstone purge independent of deleted cell purge It does sound like what I'd want (that's why I was trying to use it :) ), but it isn't working as described

Re: Configuring tombstone purge independent of deleted cell purge

2014-09-23 Thread lars hofhansl
And my other problem is that I do not read all emails before I reply. Looks like you resolved it now... All good :) From: lars hofhansl la...@apache.org To: user@hbase.apache.org user@hbase.apache.org Sent: Tuesday, September 23, 2014 10:10 PM Subject: Re

Re: Configuring tombstone purge independent of deleted cell purge

2014-09-22 Thread lars hofhansl
You can use the hbase.hstore.time.to.purge.deletes config option. You can set it globally or per Column Family. This is the description in hbase-default.xml: <property> <name>hbase.hstore.time.to.purge.deletes</name> <value>0</value> <description>The amount of time to delay purging of delete

Re: HBase establishes session with ZooKeeper and close the session immediately

2014-09-19 Thread lars hofhansl
Hi, can you define frequently? I.e. send a larger snippet of the log. Connecting every few minutes would be OK; multiple times per second would be strange. -- Lars From: tobe tobeg3oo...@gmail.com To: user@hbase.apache.org user@hbase.apache.org Sent: Thursday,

Re: Scan vs Parallel scan.

2014-09-13 Thread lars hofhansl
. 2014-09-11 7:50 GMT+02:00 lars hofhansl la...@apache.org: Which version of HBase? Can you show us the code? Your parallel scan with caching 100 takes about 6x as long as the single scan, which is suspicious because you say you have 6 regions. Are you sure you're not accidentally

Re: Scan vs Parallel scan.

2014-09-10 Thread lars hofhansl
Which version of HBase? Can you show us the code? Your parallel scan with caching 100 takes about 6x as long as the single scan, which is suspicious because you say you have 6 regions. Are you sure you're not accidentally scanning all the data in each of your parallel scans? -- Lars

Re: HBase - Performance issue

2014-09-06 Thread lars hofhansl
, can you post a jstack of the processes that experience high wait times? -- Lars From: kiran kiran.sarvabho...@gmail.com To: user@hbase.apache.org; lars hofhansl la...@apache.org Sent: Saturday, September 6, 2014 11:30 AM Subject: Re: HBase - Performance issue

[ANNOUNCE] HBase 0.94.23 is available for download

2014-09-05 Thread lars hofhansl
The HBase Team is pleased to announce the immediate release of HBase 0.94.23. Download it from your favorite Apache mirror [1]. This release has also been pushed to Apache's maven repository. All previous 0.92.x and 0.94.x releases can be upgraded to 0.94.23 via a rolling upgrade without

Re: ttl problem, cells are not deleted

2014-09-02 Thread lars hofhansl
Are these the last versions of a cell? You have MIN_VERSIONS set to 1, meaning HBase will keep at least one version around regardless of whether it is expired by TTL or not. -- Lars From: Serega Sheypak serega.shey...@gmail.com To: user user@hbase.apache.org

Re: state-of-the-art method for merging regions on v0.94

2014-08-28 Thread lars hofhansl
Hey Ted! How many regions (per region server) do you have on average? If it's not too bad you might just be able to increase hbase.hregion.max.filesize to 10 or 20g and bounce all the region servers. Then as you write more data you will fill up the existing regions. Too bad is fuzzy. If you

Re: state-of-the-art method for merging regions on v0.94

2014-08-28 Thread lars hofhansl
Agreed. Bryan, we should pull in your code if that works better. -- Lars From: Andrew Purtell apurt...@apache.org To: user@hbase.apache.org user@hbase.apache.org Cc: Development developm...@mentacapital.com Sent: Thursday, August 28, 2014 12:12 PM Subject:

Re: [ANNOUNCE] New HBase PMC members: Matteo Bertozzi, Nick Dimiduk, and Jeffrey Zhong

2014-08-26 Thread lars hofhansl
Congrats and welcome. Great to have you on board! From: rajeshbabu chintaguntla rajeshbabu.chintagun...@huawei.com To: d...@hbase.apache.org d...@hbase.apache.org; user user@hbase.apache.org Sent: Tuesday, August 26, 2014 1:58 AM Subject: RE: [ANNOUNCE] New

Re: Should scan check the limitation of the number of versions?

2014-08-25 Thread lars hofhansl
Queries of past time ranges only work correctly when KEEP_DELETED_CELLS is enabled for the column families. From: tobe tobeg3oo...@gmail.com To: hbase-dev d...@hbase.apache.org Cc: user@hbase.apache.org user@hbase.apache.org Sent: Monday, August 25, 2014 4:32

Re: HBase 0.94 vs 0.98

2014-08-25 Thread lars hofhansl
Hi Flavio, all new users should go to 0.98. That said, 0.94 is probably still the most widely used version and has been the most battle tested at this point. HBase is open source :) 0.94 will be supported as long as folks still make patches for it. Currently there is still a steady stream of

Re: GC peaks during major compaction

2014-08-21 Thread lars hofhansl
You want to decrease your young gen (defaults to 40% of heap, which is *way* too big for HBase). I wrote the reasoning here: http://hadoop-hbase.blogspot.com/2014/03/hbase-gc-tuning-observations.html (Basically HBase produces a lot of day-to-day garbage that can be collected quickly. You do not

Re: How are split files distributed across Region servers?

2014-08-19 Thread lars hofhansl
I'd change the max file size to 20GB. That'd give you 5000 regions for 100TB. From: Jianshi Huang jianshi.hu...@gmail.com To: user@hbase.apache.org Sent: Monday, August 18, 2014 12:22 PM Subject: Re: How are split files distributed across Region servers? Hi
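The region count quoted above follows from simple division. A minimal sketch in Python, assuming decimal units (1 TB = 1000 GB) and regions that are filled up to the configured maximum size; the helper `region_count` is hypothetical:

```python
GB = 10**9
TB = 10**12


def region_count(total_bytes: int, max_region_bytes: int) -> int:
    """Number of regions needed if every region grows to its maximum size."""
    # Ceiling division: a partially filled last region still counts as a region.
    return -(-total_bytes // max_region_bytes)


regions_at_20g = region_count(100 * TB, 20 * GB)   # the 20GB setting suggested above
regions_at_10g = region_count(100 * TB, 10 * GB)   # the 0.98 default, for comparison
```

In practice regions split before they are completely full, so the real count would tend to be somewhat higher than this lower bound.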

Re: GC peaks during major compaction

2014-08-14 Thread lars hofhansl
Can you send your JVM command line arguments (specifically how you tune the GC)? -- Lars From: yanivG yaniv.yancov...@gmail.com To: user@hbase.apache.org Sent: Thursday, August 14, 2014 9:01 AM Subject: GC peaks during major compaction Hi, We are running CDH

Re: A better way to migrate the whole cluster?

2014-08-14 Thread lars hofhansl
What version of HBase? How are you running CopyTable? A day for 1.8T is not what we would expect. You can definitely take a snapshot and then export the snapshot to another cluster, which will move the actual files; but CopyTable should not be so slow. -- Lars

Re: Re: Any fast way to random access hbase data?

2014-08-13 Thread lars hofhansl
<name>hfile.block.cache.size</name> <value>0.0</value> Yikes. Don't do that. :) Even if your blocks are in the OS cache, upon each single Get HBase needs to re-allocate a new 64k block on the heap (including the index blocks). If you see no chance that a working set of the data fits into the

Re: HBase client hangs after client-side OOM

2014-08-13 Thread lars hofhansl
Hey Ted, so this is a problem with the ZK client, it seems to not clean itself up correctly upon receiving an exception at the wrong moment. Which version of ZK are you using? -- Lars - Original Message - From: Ted Tuttle t...@mentacapital.com To: user@hbase.apache.org

Re: How to verify data from MySQL and HBase

2014-08-12 Thread lars hofhansl
Just in the interest of stating the obvious: Don't write a tool that scans through all the data in MySQL (or HBase) and then looks up each individual row (or even batches of rows) in the other store. That is very inefficient if you have a lot of data. Do it like a merge-join instead: Get
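The merge-join approach can be sketched as a single pass over two key-sorted streams. This is a Python illustration, not code from the thread: it assumes both stores can be read in the same key order (a MySQL `ORDER BY` using a binary collation would be needed to match HBase's byte order), and the function name `diff_sorted` and the sample rows are hypothetical.

```python
def diff_sorted(left, right):
    """Yield (key, left_value, right_value) for every difference between two
    streams of (key, value) pairs that are both sorted by key."""
    left_it, right_it = iter(left), iter(right)
    l = next(left_it, None)
    r = next(right_it, None)
    while l is not None or r is not None:
        if r is None or (l is not None and l[0] < r[0]):
            yield (l[0], l[1], None)          # key exists only on the left
            l = next(left_it, None)
        elif l is None or r[0] < l[0]:
            yield (r[0], None, r[1])          # key exists only on the right
            r = next(right_it, None)
        else:
            if l[1] != r[1]:
                yield (l[0], l[1], r[1])      # same key, different values
            l = next(left_it, None)
            r = next(right_it, None)


# Stand-ins for an ordered MySQL query and an HBase scan.
mysql_rows = [(b"a", 1), (b"b", 2), (b"d", 4)]
hbase_rows = [(b"a", 1), (b"b", 3), (b"c", 9)]
differences = list(diff_sorted(mysql_rows, hbase_rows))
```

Each store is read exactly once and sequentially, so the cost is two scans rather than one random lookup per row.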

Re: Large discrepancy in hdfs hbase rootdir size after copytable operation.

2014-08-09 Thread lars hofhansl
we should include a better way with HBase, maybe using Merkle trees, or at least hashes of ranges, and compare those. -- Lars From: Colin Kincaid Williams disc...@uw.edu To: user@hbase.apache.org; lars hofhansl la...@apache.org Sent: Saturday, August 9, 2014 2

Re: [VOTE] The 1st HBase 0.98.5 release candidate (RC0) is available, vote closing 8/11/2014

2014-08-08 Thread lars hofhansl
+1 - installed -hadoop2 bin tar ball - checked contents, documentation, checksums, etc - inserted some data, checked with raw-scans, flushed, compacted, inserted again, checked again. All good. - downloaded source tarball, checked contents, CHANGES.txt, etc. -- Lars

[ANNOUNCE] HBase 0.94.22 is available for download

2014-08-08 Thread lars hofhansl
The HBase Team is pleased to announce the immediate release of HBase 0.94.22. Download it from your favorite Apache mirror [1]. This release has also been pushed to Apache's maven repository. All previous 0.92.x and 0.94.x releases can be upgraded to 0.94.22 via a rolling upgrade without

Re: Large discrepancy in hdfs hbase rootdir size after copytable operation.

2014-08-08 Thread lars hofhansl
Hi Colin, you might want to consider upgrading. The current stable version is 0.98.4 (soon .5). Even just going to 0.94 will give a lot of new features, stability, and performance. 0.92.x can be upgraded to 0.94.x without any downtime and without any upgrade steps necessary. For an upgrade to

Re: What is replacement of kv.getBuffer() in HBase0.98?

2014-08-05 Thread lars hofhansl
Hi Anil, the point of getFamilyArray() is that it may return the entire buffer, but it does not *have* to. Everybody should use the new API, so that we can eventually have KeyValues (Cells) that do not have to be backed by a single byte[]. (i.e. we can have KeyValues backed by separate row,

Re: Coprocessor beacuse of TTL expired?

2014-07-21 Thread lars hofhansl
It's possible but ... uhm... Tedious. You would have to use the pre flush and compaction scanner open hooks, do the TTL calculation yourself for each KeyValue that passes through the scanner and then act accordingly. Check out RegionObserver.preCompactScannerOpen(...) and

Re: Coprocessor beacuse of TTL expired?

2014-07-21 Thread lars hofhansl
hofhansl la...@apache.org Sent: Subject: Re: Coprocessor beacuse of TTL expired? I had open 10115 a while back but I think  11054 mostly covers it. But this was exactly the idea behind 10115. 2014-07-21 10:19 GMT-04:00 lars hofhansl la...@apache.org: It's possible but ... uhm... Tedious

Re: Cluster sizing guidelines

2014-07-19 Thread lars hofhansl
host. What are your thoughts on this? -Amandeep On Wed, Jul 16, 2014 at 9:06 AM, lars hofhansl la...@apache.org wrote: This is a somewhat fuzzy art. Some points to consider: 1. All data is replicated three ways. Or in other words, if you run three RegionServer/Datanodes each

Re: Cluster sizing guidelines

2014-07-19 Thread lars hofhansl
on this? -Amandeep On Wed, Jul 16, 2014 at 9:06 AM, lars hofhansl la...@apache.org wrote: This is a somewhat fuzzy art. Some points to consider: 1. All data is replicated three ways. Or in other words, if you run three RegionServer/Datanodes each machine will get 100

Re: Cluster sizing guidelines

2014-07-16 Thread lars hofhansl
This is a somewhat fuzzy art. Some points to consider: 1. All data is replicated three ways. Or in other words, if you run three RegionServer/Datanodes each machine will get 100% of the writes. If you run 6, each gets 50% of the writes. From that aspect HBase clusters with less than 9
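The write percentages in point 1 follow from the replication factor. A minimal sketch in Python, assuming three-way replication and writes spread evenly across the cluster; the helper `write_share` is hypothetical:

```python
def write_share(nodes: int, replication: int = 3) -> float:
    """Fraction of the cluster's total write volume that lands on each node."""
    # Every write is stored `replication` times, spread over `nodes` machines.
    # A node cannot receive more than 100% of the writes.
    return min(1.0, replication / nodes)


share_with_3_nodes = write_share(3)   # every node stores every write
share_with_6_nodes = write_share(6)   # each node stores half of the writes
```

Adding machines only reduces the per-node write load once the cluster is larger than the replication factor.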

Re: HBase Upgrade from 0.94.x to 0.98.x: How to do rollback?

2014-07-09 Thread lars hofhansl
Not sure many people have tried that here. Sounds like it should work, though. Any chance you could set up a single node cluster (i.e. run a single node HDFS and HBase on the same machine) and try? Should not be hard to set up, and would be valuable for folks here to know. -- Lars

Re: FileNotFoundException in bulk load

2014-07-08 Thread lars hofhansl
From: Amit Sela am...@infolinks.com To: user@hbase.apache.org; lars hofhansl la...@apache.org Sent: Tuesday, July 8, 2014 9:38 AM Subject: Re: FileNotFoundException in bulk load I think Lars is right. We ended up with errors in the RAID on that regionserver the next day. Still, shouldn't HDFS

Re: FileNotFoundException in bulk load

2014-07-06 Thread lars hofhansl
If we do further discussion there we should reopen the jira. Fine if the exception is identical, or open a new one if this is a different one. At first blush this looks a bit like a temporary unavailability of HDFS. -- Lars From: Ted Yu yuzhih...@gmail.com

Re: Can 0.94.9 enable multi-thread for memstore flush

2014-07-05 Thread lars hofhansl
I closed it because nobody seemed to care. If there's rekindled interest for this in 0.94 I would not be opposed to backporting that feature. -- Lars From: Ted Yu yuzhih...@gmail.com To: user@hbase.apache.org user@hbase.apache.org Sent: Thursday, July 3, 2014

Re: How Hbase achieves efficient random access?

2014-07-05 Thread lars hofhansl
What Ted and Intea said. Are you asking out of interest or do you see performance issues? One issue is that the KeyValues (KVs) in the blocks are not indexed. KVs are variable length and hence once a block is loaded it needs to be searched linearly in order to find the KV (or determine its

Re: How Hbase achieves efficient random access?

2014-07-05 Thread lars hofhansl
Yeah. A block is quickly located from a few index blocks that are always cached; the block itself can be cached. A block is quickly searched for the KV in question. -- Lars From: yl wu wuyl6...@gmail.com To: user@hbase.apache.org; lars hofhansl la...@apache.org
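The two-step lookup discussed in this thread (find the block through a small cached index, then scan inside the block because the variable-length KVs are not indexed) can be sketched roughly as follows. This is a toy model, not HBase's HFile code: the data layout, the function names and the use of plain strings as keys are all assumptions made for illustration.

```python
from bisect import bisect_right


def find_block(index_first_keys, row_key):
    """Binary-search the sorted first keys of all blocks.

    Returns the position of the only block that could hold row_key,
    or -1 if row_key sorts before the first block.
    """
    return bisect_right(index_first_keys, row_key) - 1


def search_block(block, row_key):
    """Scan one block linearly; entries are sorted but not indexed."""
    for key, value in block:
        if key == row_key:
            return value
        if key > row_key:
            break  # passed the spot where the key would be
    return None


def get(index_first_keys, blocks, row_key):
    """Locate the block through the index, then search inside it."""
    pos = find_block(index_first_keys, row_key)
    if pos < 0:
        return None
    return search_block(blocks[pos], row_key)


if __name__ == "__main__":
    blocks = [[("a", 1), ("c", 2)], [("f", 3), ("h", 4)]]
    index = [block[0][0] for block in blocks]  # first key of each block
    print(get(index, blocks, "h"))
    print(get(index, blocks, "b"))
```

The index lookup is logarithmic in the number of blocks, while the in-block scan is linear in the number of entries of a single block, which is why block size affects random-read cost.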

[ANNOUNCE] HBase 0.94.21 is available for download

2014-07-03 Thread lars hofhansl
The HBase Team is pleased to announce the immediate release of HBase 0.94.21. Download it from your favorite Apache mirror [1]. This release has also been pushed to Apache's maven repository. All previous 0.92.x and 0.94.x releases can be upgraded to 0.94.21 via a rolling upgrade without downtime,

Re: Disk space leak when using HBase and HDFS ShortCircuit

2014-06-27 Thread lars hofhansl
Sounds scary! HDP 2.1 ships with Hadoop 2.4.0, right? Sounds like a serious HDFS bug with SSR. Maybe we need to at least document this behavior and recommend settings in the HBase book. -- Lars - Original Message - From: Giuseppe Reina g.re...@gmail.com To: user@hbase.apache.org Cc:

Re: Does compression ever improve performance?

2014-06-15 Thread lars hofhansl
Unfortunately it's not quite that simple. Currently the HBase scanning guts expect all KeyValues to be laid out in memory in a contiguous way, so with encoding they need to be copied in memory to make it... We're working on fixing it, but this is currently the way it is. So on the one hand you

Re: [CANCELLED] [VOTE] The 2nd HBase 0.98.3 release candidate (RC1) is available, vote closing 6/1/2014

2014-05-31 Thread lars hofhansl
Thanks for being thorough here! I'd much rather do some work multiple times than see a new point release with known issues in it. -- Lars From: Andrew Purtell apurt...@apache.org To: d...@hbase.apache.org d...@hbase.apache.org Cc: user@hbase.apache.org

[ANNOUNCE] HBase 0.94.20 is available for download

2014-05-29 Thread lars hofhansl
The HBase Team is pleased to announce the immediate release of HBase 0.94.20. Download it from your favorite Apache mirror [1]. This release has also been pushed to Apache's maven repository. (Thanks Stack for working out the process with git) All previous 0.92.x and 0.94.x releases can be upgraded

Re: [VOTE] The 1st HBase 0.98.3 release candidate (RC0) is available, vote closing 5/31/2014

2014-05-24 Thread lars hofhansl
+1 - grabbed the hadoop-1-bin tarball, checked checksums - unpacked, checked documentation and contents - started in local mode, checked UI pages - created some tables, wrote some data - flushed, compacted, scanned All good. Might do some more extensive tests later. -- Lars - Original

Re: Replication chunk size

2014-05-24 Thread lars hofhansl
The replication code continuously reads edits from the WAL and ships them over when (1) there's nothing else to read or (2) the chunksize is reached. The chunksize just limits the maximum size of the RPC request to ship the edits over. -- Lars - Original Message - From: nipra
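The shipping rule described above can be modelled with a small generator. This is a simplified sketch, not the actual HBase replication source: edits are plain byte strings, `batch_edits` and `chunk_size_bytes` are hypothetical names, and the chunk size is treated as a soft threshold (a batch is shipped as soon as it reaches the limit), which is an assumption.

```python
def batch_edits(edits, chunk_size_bytes):
    """Group WAL edits into batches to ship.

    A batch is shipped when (1) there is nothing else to read or
    (2) the accumulated size reaches chunk_size_bytes.
    """
    batch, size = [], 0
    for edit in edits:
        batch.append(edit)
        size += len(edit)
        if size >= chunk_size_bytes:
            yield batch  # condition (2): chunk size reached
            batch, size = [], 0
    if batch:
        yield batch  # condition (1): nothing left to read


if __name__ == "__main__":
    for shipped in batch_edits([b"aaaa", b"bbbb", b"cc"], chunk_size_bytes=8):
        print(len(shipped), "edits,", sum(len(e) for e in shipped), "bytes")
```

Under this model a lightly loaded cluster ships small batches promptly, and the chunk size only matters when edits arrive faster than they are shipped.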

Re: [VOTE] The 1st HBase 0.98.3 release candidate (RC0) is available, vote closing 5/31/2014

2014-05-23 Thread lars hofhansl
Did you build this yourself? Apache Jenkins seems to be taking a holiday. -- Lars From: Andrew Purtell apurt...@apache.org To: d...@hbase.apache.org d...@hbase.apache.org Cc: user@hbase.apache.org user@hbase.apache.org Sent: Friday, May 23, 2014 8:55 PM Subject:
