Re: Sporadic memstore slowness for Read Heavy workloads

2014-01-27 Thread Varun Sharma
Actually, I now have another question because of the way our work load is structured. We use a wide schema and each time we write, we delete the entire row and write a fresh set of columns - we want to make sure no old columns survive. So, I just want to see if my picture of the memstore at this po

Re: Sporadic memstore slowness for Read Heavy workloads

2014-01-27 Thread Vladimir Rodionov
Varun, There is no need to open new JIRA - there are two already: https://issues.apache.org/jira/browse/HBASE-9769 https://issues.apache.org/jira/browse/HBASE-9778 Both with patches, you can grab and test them. -Vladimir On Mon, Jan 27, 2014 at 9:36 PM, Varun Sharma wrote: > Hi lars, > > Tha

Re: Sporadic memstore slowness for Read Heavy workloads

2014-01-27 Thread Varun Sharma
Hi lars, Thanks for the background. It seems that for our case, we will have to consider some solution like the Facebook one, since the next column is always the next one - this can be a simple flag. I am going to raise a JIRA and we can discuss there. Thanks Varun On Sun, Jan 26, 2014 at 3:43

Re: Balancer switch runs causing problems

2014-01-27 Thread Stack
/**
 * A janitor for the catalog tables. Scans the .META. catalog
 * table on a period looking for unused regions to garbage collect.
 */
class CatalogJanitor extends Chore {
  private static final Log LOG = LogFactory.getLog(CatalogJanitor.class.getName());
  private final Server server;
  privat

Re: Balancer switch runs causing problems

2014-01-27 Thread Varun Sharma
But continue to see reads on META - no idea why ? On Mon, Jan 27, 2014 at 8:52 PM, Varun Sharma wrote: > We are not seeing any balancer related logs btw anymore... > > > On Mon, Jan 27, 2014 at 8:23 PM, Ted Yu wrote: > >> Looking at the changes since release 0.94.7, I found: >> >> HBASE-8655 B

Re: Balancer switch runs causing problems

2014-01-27 Thread Varun Sharma
We are not seeing any balancer related logs btw anymore... On Mon, Jan 27, 2014 at 8:23 PM, Ted Yu wrote: > Looking at the changes since release 0.94.7, I found: > > HBASE-8655 Backport to 94 - HBASE-8346(Prefetching .META. rows in case only > when useCache is set to true) > HBASE-8698 potentia

Re: Balancer switch runs causing problems

2014-01-27 Thread Ted Yu
Looking at the changes since release 0.94.7, I found: HBASE-8655 Backport to 94 - HBASE-8346(Prefetching .META. rows in case only when useCache is set to true) HBASE-8698 potential thread creation in MetaScanner.metaScan If possible, can you upgrade your cluster ? Cheers On Mon, Jan 27, 2014 a

Re: Balancer switch runs causing problems

2014-01-27 Thread Ted Yu
Do you see the following (from HConnectionManager$HConnectionImplementation#locateRegionInMeta)?

  if (LOG.isDebugEnabled()) {
    LOG.debug("locateRegionInMeta parentTable=" +
      Bytes.toString(parentTable) + ", metaLocation=" +
      ((metaLocation ==

Re: Balancer switch runs causing problems

2014-01-27 Thread Varun Sharma
Actually not sometimes but we are always seeing a large # of .META. reads every 5 minutes. On Mon, Jan 27, 2014 at 7:47 PM, Varun Sharma wrote: > The default one with 0.94.7... - I dont see any of those logs. Also we > turned off the balancer switch - but looks like sometimes we still see a > l

Re: Balancer switch runs causing problems

2014-01-27 Thread Varun Sharma
The default one with 0.94.7... - I don't see any of those logs. Also we turned off the balancer switch - but it looks like sometimes we still see a large number of requests to the .META. table every 5 minutes. Varun On Mon, Jan 27, 2014 at 7:37 PM, Ted Yu wrote: > In HMaster#balance(), we have (same f

Re: Balancer switch runs causing problems

2014-01-27 Thread Ted Yu
In HMaster#balance(), we have (same for 0.94 and 0.96):

  for (RegionPlan plan: plans) {
    LOG.info("balance " + plan);

Do you see such a log line in the master log?

On Mon, Jan 27, 2014 at 7:26 PM, Varun Sharma wrote:
> We are seeing one other issue with high read latency (p99 etc.) on o

Re: Balancer switch runs causing problems

2014-01-27 Thread Jean-Marc Spaggiari
Hi Varun, Which balancer have you configured, and which version of HBase are you using? JM 2014-01-27 Varun Sharma > We are seeing one other issue with high read latency (p99 etc.) on one of > our read heavy hbase clusters which is correlated with the balancer runs - > every 5 minutes. > > I

Balancer switch runs causing problems

2014-01-27 Thread Varun Sharma
We are seeing one other issue with high read latency (p99 etc.) on one of our read heavy hbase clusters which is correlated with the balancer runs - every 5 minutes. If there is no balancing to do, does the balancer only scan the table every 5 minutes - does it do anything on top of that if the re

Re: [VOTE] The 1st HBase 0.98.0 release candidate is available for download

2014-01-27 Thread tsuna
On Mon, Jan 27, 2014 at 10:16 AM, Andrew Purtell wrote: > Let me vote -1 on 0.98.0RC0 on account of HBASE-10422. Sorry I didn't mean to sink the RC just for this. Just thought that I'd mention it because generally somebody else will come with a real / good reason why to sink the first RC of any

Re: HBase 6x bigger than raw data

2014-01-27 Thread Ted Yu
Yes. On Mon, Jan 27, 2014 at 3:34 PM, Koert Kuipers wrote: > if compression is already enabled on a column family, do i understand it > correctly that the main benefit of DATA_BLOCK_ENCODING is in memory? > > > On Mon, Jan 27, 2014 at 6:02 PM, Nick Xie > wrote: > > > Thanks all for the informa

Re: HBase 6x bigger than raw data

2014-01-27 Thread Koert Kuipers
if compression is already enabled on a column family, do i understand it correctly that the main benefit of DATA_BLOCK_ENCODING is in memory? On Mon, Jan 27, 2014 at 6:02 PM, Nick Xie wrote: > Thanks all for the information. Appreciated!! I'll take a look and try. > > Thanks, > > Nick > > > > >

Re: HBase 6x bigger than raw data

2014-01-27 Thread Ted Yu
Enabling compression (http://hbase.apache.org/book.html#compression) is separate from data block encoding (HBASE-4218). Cheers On Mon, Jan 27, 2014 at 2:59 PM, Tom Brown wrote: > Does enabling compression include prefix compression (HBASE-4218), or is > there a separate switch for that? > > --

Re: HBase 6x bigger than raw data

2014-01-27 Thread Nick Xie
Thanks all for the information. Appreciated!! I'll take a look and try. Thanks, Nick On Mon, Jan 27, 2014 at 2:43 PM, Vladimir Rodionov wrote: > Overhead of storing small values is quite high in HBase unless you use > DATA_BLOCK_ENCODING > (not available in 0.92). I recommend you moving to 0

Re: HBase 6x bigger than raw data

2014-01-27 Thread Tom Brown
Does enabling compression include prefix compression (HBASE-4218), or is there a separate switch for that? --Tom On Mon, Jan 27, 2014 at 3:48 PM, Ted Yu wrote: > To make better use of block cache, see: > > HBASE-4218 Data Block Encoding of KeyValues (aka delta encoding / prefix > compression)

Re: HBase 6x bigger than raw data

2014-01-27 Thread Ted Yu
To make better use of block cache, see: HBASE-4218 Data Block Encoding of KeyValues (aka delta encoding / prefix compression) which is in 0.94 and above To reduce size of HFiles, please see: http://hbase.apache.org/book.html#compression On Mon, Jan 27, 2014 at 2:40 PM, Nick Xie wrote: > Tom,

RE: HBase 6x bigger than raw data

2014-01-27 Thread Vladimir Rodionov
Overhead of storing small values is quite high in HBase unless you use DATA_BLOCK_ENCODING (not available in 0.92). I recommend you moving to 0.94.latest. See: https://issues.apache.org/jira/browse/HBASE-4218 Best regards, Vladimir Rodionov Principal Platform Engineer Carrier IQ, www.carrieriq.c

Re: HBase 6x bigger than raw data

2014-01-27 Thread Nick Xie
Tom, Yes, you are right. According to this analysis ( http://prafull-blog.blogspot.in/2012/06/how-to-calculate-record-size-of-hbase.html) if it is right, then the overhead is quite big if the cell value occupies a small portion. In the analysis in that link, the overhead is actually 10x(the r

Re: HBase 6x bigger than raw data

2014-01-27 Thread Nick Xie
Hi Ted, it is 0.92.1. Does the version matter? Thanks, Nick On Mon, Jan 27, 2014 at 2:32 PM, Ted Yu wrote: > Which HBase release are you using ? > > > On Mon, Jan 27, 2014 at 2:12 PM, Nick Xie > wrote: > > > I'm importing a set of data into HBase. The CSV file contains 82 entries > > per li

Re: HBase 6x bigger than raw data

2014-01-27 Thread Ted Yu
Which HBase release are you using ? On Mon, Jan 27, 2014 at 2:12 PM, Nick Xie wrote: > I'm importing a set of data into HBase. The CSV file contains 82 entries > per line. Starting with 8 byte ID, followed by 16 byte date and the rest > are 80 numbers with 4 bytes each. > > The current HBase sc

Re: HBase 6x bigger than raw data

2014-01-27 Thread Tom Brown
I believe each cell stores its own copy of the entire row key, column qualifier, and timestamp. Could that account for the increase in size? --Tom On Mon, Jan 27, 2014 at 3:12 PM, Nick Xie wrote: > I'm importing a set of data into HBase. The CSV file contains 82 entries > per line. Starting wi
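To put rough numbers on the point above, here is a minimal Java sketch of the per-cell arithmetic. It assumes the KeyValue layout used by 0.92/0.94 (4-byte key length, 4-byte value length, 2-byte row length, row, 1-byte family length, family, qualifier, 8-byte timestamp, 1-byte type, value) and ignores HFile block and index overhead. The 8-byte row key, the 'date' family with 'value' qualifier, the 16-byte date, and the 4-byte numbers come from the original post; the 1-byte family and 2-byte qualifier lengths used for the numeric columns are hypothetical, since the original schema is not fully quoted here. The class and method names are made up for illustration.

```java
// Sketch: approximate serialized size of one HBase KeyValue (0.92/0.94 layout, no tags).
final class KeyValueSizeSketch {
    // Fixed per-cell fields: key length (4) + value length (4) + row length (2)
    // + family length (1) + timestamp (8) + key type (1).
    static final int FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1;

    // Every cell repeats the row key, family name and qualifier next to its value.
    static int keyValueSize(int rowLen, int familyLen, int qualifierLen, int valueLen) {
        return FIXED_OVERHEAD + rowLen + familyLen + qualifierLen + valueLen;
    }

    public static void main(String[] args) {
        int rowLen = 8; // 8-byte ID used as the row key (from the original post)

        // 16-byte date stored under family 'date', qualifier 'value' (from the original post).
        int dateCell = keyValueSize(rowLen, "date".length(), "value".length(), 16);

        // Hypothetical naming: 1-byte family and 2-byte qualifier per 4-byte number.
        int numberCell = keyValueSize(rowLen, 1, 2, 4);

        System.out.println("date cell bytes:   " + dateCell);
        System.out.println("number cell bytes: " + numberCell);
        System.out.println("row total bytes:   " + (dateCell + 80 * numberCell));
    }
}
```

With short values, the fixed fields plus the repeated row key dominate each cell, so shorter family and qualifier names, or packing several numbers into one value, shrink the stored size directly.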

HBase 6x bigger than raw data

2014-01-27 Thread Nick Xie
I'm importing a set of data into HBase. The CSV file contains 82 entries per line. Starting with 8 byte ID, followed by 16 byte date and the rest are 80 numbers with 4 bytes each. The current HBase schema is: ID as row key, date as a 'date' family with 'value' qualifier, the rest is in another fam

Re: user_permission error - undefined method 'getTable'

2014-01-27 Thread Ted Yu
Alex: Thanks for the quick response. On Mon, Jan 27, 2014 at 1:41 PM, Alex Nastetsky wrote: > That fixed it, thanks! > > > On Mon, Jan 27, 2014 at 4:21 PM, Ted Yu wrote: > > > Thanks for reporting this, Alex. > > > > I logged HBASE-10426 and put up a patch. > > > > Mind giving it a try ? > > >

Re: user_permission error - undefined method 'getTable'

2014-01-27 Thread Alex Nastetsky
That fixed it, thanks! On Mon, Jan 27, 2014 at 4:21 PM, Ted Yu wrote: > Thanks for reporting this, Alex. > > I logged HBASE-10426 and put up a patch. > > Mind giving it a try ? > > > On Mon, Jan 27, 2014 at 12:46 PM, Alex Nastetsky >wrote: > > > Hi, > > > > When I run the "user_permission" com

Re: HBase MapReduce with setup function problem

2014-01-27 Thread Ted Yu
I agree that we should find out why initialization got stuck. I noticed an empty catch block:

  } catch (IOException e) {
  }

Can you add some logging there to see what might have gone wrong?

Thanks

On Mon, Jan 27, 2014 at 11:56 AM, daidong wrote:
> Dear Ted, Thanks very much
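As an illustration of the logging suggested for the empty catch block, here is a minimal self-contained Java sketch. It is not the poster's code: the `Opener` interface and the `openOrLog` helper are hypothetical stand-ins for whatever setup() opens, and it uses java.util.logging only so that it compiles without Hadoop or HBase on the classpath (a real job would more likely use the logging facility it already has).

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

final class SetupLoggingSketch {
    private static final Logger LOG = Logger.getLogger(SetupLoggingSketch.class.getName());

    /** Hypothetical stand-in for the initialization done in setup(), e.g. opening a table. */
    interface Opener {
        void open() throws IOException;
    }

    /** Runs the opener; logs the exception with its stack trace instead of swallowing it. */
    static boolean openOrLog(Opener opener) {
        try {
            opener.open();
            return true;
        } catch (IOException e) {
            LOG.log(Level.SEVERE, "Failed to initialize table in setup()", e);
            return false;
        }
    }

    public static void main(String[] args) {
        // Simulate a failing initialization to see the log entry.
        boolean ok = openOrLog(() -> { throw new IOException("simulated failure"); });
        System.out.println("initialized: " + ok);
    }
}
```

Passing the exception object to the logger keeps the stack trace, which is usually what shows where initialization failed.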

Re: user_permission error - undefined method 'getTable'

2014-01-27 Thread Ted Yu
Thanks for reporting this, Alex. I logged HBASE-10426 and put up a patch. Mind giving it a try ? On Mon, Jan 27, 2014 at 12:46 PM, Alex Nastetsky wrote: > Hi, > > When I run the "user_permission" command on a table, I get this error: > > hbase(main):010:0> create 'foo','bar' > 0 row(s) in 0.57

user_permission error - undefined method 'getTable'

2014-01-27 Thread Alex Nastetsky
Hi,

When I run the "user_permission" command on a table, I get this error:

hbase(main):010:0> create 'foo','bar'
0 row(s) in 0.5780 seconds

=> Hbase::Table - foo
hbase(main):011:0> user_permission 'foo'
User Table,Family,Qualifier:Permission
*ERROR: undefined

Re: HBase MapReduce with setup function problem

2014-01-27 Thread daidong
Dear Ted, Thanks very much for your reply! Yes, MultiTableInputFormat may work here, but I still want to know how to connect to an HBase table inside MapReduce applications, because I may also need to write to tables inside the map function. Do you know why the previous MR application does not work? Because the

Re: Hbase tuning for heavy write cluster

2014-01-27 Thread Stack
On Sun, Jan 26, 2014 at 4:13 PM, Rohit Dev wrote: > Hi Lars, > > I changed java heap to 31GB and also reduced memstore flush size to 256MB > (down from 512MB). All of the servers are running quiet, except for 1. > > - This 1 particular server is doing ~100 Memstore flushes in every 5 Mins, > that

Re: [VOTE] The 1st HBase 0.98.0 release candidate is available for download

2014-01-27 Thread Stack
Maybe spin it tomorrow. I was going to take a look at 0.98 today. Let me see if I can turn up something else? St.Ack On Mon, Jan 27, 2014 at 10:16 AM, Andrew Purtell wrote: > Let me vote -1 on 0.98.0RC0 on account of HBASE-10422. > > I will spin a new RC today. > > > On Sat, Jan 25, 2014 at 11

Re: [VOTE] The 1st HBase 0.98.0 release candidate is available for download

2014-01-27 Thread Andrew Purtell
Let me vote -1 on 0.98.0RC0 on account of HBASE-10422. I will spin a new RC today. On Sat, Jan 25, 2014 at 11:32 AM, Andrew Purtell wrote: > The 1st HBase 0.98.0 release candidate (RC0) is available for download at > http://people.apache.org/~apurtell/0.98.0RC0/ and Maven artifacts are > also a

Re: [VOTE] The 1st HBase 0.98.0 release candidate is available for download

2014-01-27 Thread Andrew Purtell
This was obnoxious, sorry about that Benoit. I will sink this RC and spin another one today. On Sun, Jan 26, 2014 at 6:37 PM, tsuna wrote: > If this RC sinks, can you please include this one in 0.98.0: > https://issues.apache.org/jira/browse/HBASE-10422 > > -- > Benoit "tsuna" Sigoure > --

is hbase-6580 in 94.7?

2014-01-27 Thread Tianying Chang
Hi, I am trying to use HConnection.getTable() instead of HTablePool, since HTablePool is being deprecated. But we are running HBase 0.94.7 in production. It seems HConnection.getTable() is only available in 0.94.11 and later? https://issues.apache.org/jira/browse/HBASE-6580 Thanks, Tian-Ying

Re: HBase MapReduce with setup function problem

2014-01-27 Thread Ted Yu
Have you considered using MultiTableInputFormat ? Cheers On Mon, Jan 27, 2014 at 9:14 AM, daidong wrote: > Dear all, > > I am writing a MapReduce application processing HBase table. In each map, > it needs to read data from another HBase table, so i use the 'setup' > function to initialize t

Re: [VOTE] The 1st HBase 0.98.0 release candidate is available for download

2014-01-27 Thread Stack
I committed your patch on all branches Benoit. It'll make the next RC and any releases from here on out. St.Ack On Sun, Jan 26, 2014 at 6:37 PM, tsuna wrote: > If this RC sinks, can you please include this one in 0.98.0: > https://issues.apache.org/jira/browse/HBASE-10422 > > -- > Benoit "tsun

Re: [VOTE] The 1st HBase 0.98.0 release candidate is available for download

2014-01-27 Thread Stack
I 'closed' it for you Andrew and it looks good now (I tested with https://github.com/saintstack/hbase-downstreamer). That you have to 'close' is in the doc but the releasing section is long and tricky [1] and likely a little stale too. St.Ack 1. "..While in this 'open' state you can check out wh

HBase MapReduce with setup function problem

2014-01-27 Thread daidong
Dear all,

I am writing a MapReduce application that processes an HBase table. Each map task needs to read data from another HBase table, so I use the 'setup' function to initialize the HTable instance like this:

  @Override
  public void setup(Context context){
    Configuration conf = HBaseConf

Re: [VOTE] The 1st HBase 0.98.0 release candidate is available for download

2014-01-27 Thread Ted Yu
Andy: I tried to view https://repository.apache.org/content/repositories/orgapachehbase-1001 but the browser gave me: 404 - ItemNotFoundException Repository "orgapachehbase-1001 (staging: open)" [id=orgapachehbase-1001] exists but is not exposed. org.sonatype.nexus.proxy.ItemNotFoundException: R

Re: Off-heap block cache fails in 0.94.6

2014-01-27 Thread Dean
Hi Ram, We'll give it a shot, thanks! -Dean

client disconnects when calling HBase endpoint

2014-01-27 Thread david pocivalnik
Hi everybody, I've posted a question on SO, but I think it's better off here. Maybe you can give me some ideas or thoughts on the problem? Here's the question: http://stackoverflow.com/questions/21312297/client-disconnects-when-calling-hbase-endpoint Thanks in advance