unable to limit number of records using scan with hbase version 0.94.15-cdh4.7.0

2015-02-25 Thread Gurpreet Singh Bawa
Hi Team, I am trying to apply a limit on my scan query, but cannot find a suitable method in the Scan class to do so. I am using org.apache.hbase:hbase 0.94.15-cdh4.7.0. Kindly let me know if there is some other method available for limiting the records. Thanks Gurpreet

Re: HBase with opentsdb creates huge .tmp file, runs out of hdfs space

2015-02-25 Thread brady2
Hi Sathya and Nick, Here are the stack traces of the region server dumps when the huge .tmp files are created: https://drive.google.com/open?id=0B1tQg4D17jKQNDdFZkFQTlg4ZjQ&authuser=0 As background we are not using

Re: table splitting - how to check

2015-02-25 Thread Jean-Marc Spaggiari
Hi Marcelo, Truncate also removes the region boundaries. You can use truncate_preserve if you want to keep your region splits. Not sure if it's available in 0.96... Also, I don't think you can look at the splits from the shell command... JM 2015-02-25 10:09 GMT-05:00 Marcelo Valle (BLOOMBERG/

Re: table splitting - how to check

2015-02-25 Thread Ted Yu
truncate_preserve.rb was added by HBASE-5525, which went into 0.95, so it should be in 0.96. Cheers On Wed, Feb 25, 2015 at 7:14 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Marcelo, Truncate also removes the region boundaries. You can use truncate_preserve if you want to keep

Re: Table.get(List&lt;Get&gt;) overwhelms several RSs

2015-02-25 Thread Ted Yu
Was the underlying table balanced (meaning its regions spread evenly across region servers)? What release of HBase are you using? Cheers On Wed, Feb 25, 2015 at 7:08 AM, Ted Tuttle t...@mentacapital.com wrote: Hello- In the last week we had multiple times where we lost 5 of 8 RSs in the

table splitting - how to check

2015-02-25 Thread Marcelo Valle (BLOOMBERG/ LONDON)
Hi, I have created an HBase table like this: t = create 'HBaseSerialWritesPOC', 'user_id_ts', {NAME => 'alnfo'}, {SPLITS => ['1000', '2000', '3000', '4000', '5000',

Re: table splitting - how to check

2015-02-25 Thread Marcelo Valle (BLOOMBERG/ LONDON)
Thanks! From: yuzhih...@gmail.com Subject: Re: table splitting - how to check truncate_preserve.rb was added by HBASE-5525 which went into 0.95. So it should be in 0.96 Cheers On Wed, Feb 25, 2015 at 7:14 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Marcelo, Truncate also

Table.get(List&lt;Get&gt;) overwhelms several RSs

2015-02-25 Thread Ted Tuttle
Hello- In the last week we had multiple times where we lost 5 of 8 RSs in the space of a few minutes because of slow GCs. We traced this back to a client calling Table.get(List&lt;Get&gt; gets) with a collection containing ~4000 individual gets. We've worked around this by limiting the number of Gets
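The workaround Ted describes, capping the number of Gets sent in one call, can be sketched with a generic batching helper. The helper name and batch size below are illustrative, not from the thread; each sub-list would then go to its own `table.get(...)` call.

```java
import java.util.ArrayList;
import java.util.List;

public class GetBatcher {
    // Split a large list (e.g. ~4000 Gets) into bounded batches so a
    // single Table.get(List<Get>) call cannot overwhelm the region servers.
    public static <T> List<List<T>> batches(List<T> items, int batchSize) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            out.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < 4000; i++) keys.add(i);
        List<List<Integer>> chunks = batches(keys, 100);
        System.out.println(chunks.size());         // 40 batches
        System.out.println(chunks.get(39).size()); // 100 keys in the last one
    }
}
```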

Re: unable to limit number of records using scan with hbase version 0.94.15-cdh4.7.0

2015-02-25 Thread Jean-Marc Spaggiari
Hi Gurpreet, Can you share part of the code you have for this scan? What do you mean by a limit? Can't you just stop calling next() once you have enough rows? 2015-02-25 3:56 GMT-05:00 Gurpreet Singh Bawa gurpreet.b...@snapdeal.com: Hi Team I am trying to apply a limit on my scan query,

Re: unable to limit number of records using scan with hbase version 0.94.15-cdh4.7.0

2015-02-25 Thread Ted Yu
Gurpreet: Would the following method of Scan serve your needs? "To limit the maximum number of values returned for each call to next(), execute {@link #setBatch(int) setBatch}." Cheers On Wed, Feb 25, 2015 at 6:43 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Gurpreet, Can
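Note the two suggestions address different limits: setBatch caps the number of columns returned per next() call, while a row limit on a 0.94 scan is usually enforced client-side, as Jean-Marc suggests. A minimal sketch of that client-side pattern, using a plain Iterator as a stand-in for a ResultScanner (which is Iterable):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ScanLimit {
    // Client-side row limit: pull at most `limit` results, then stop
    // calling next(). With HBase this would iterate a ResultScanner
    // the same way, then close it.
    public static <T> List<T> takeFirst(Iterator<T> results, int limit) {
        List<T> out = new ArrayList<>();
        while (out.size() < limit && results.hasNext()) {
            out.add(results.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("r1", "r2", "r3", "r4", "r5");
        System.out.println(takeFirst(rows.iterator(), 3)); // [r1, r2, r3]
    }
}
```

Pairing this with Scan.setCaching keeps the client from prefetching many more rows than the limit needs.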

Re: table splitting - how to check

2015-02-25 Thread Sean Busbey
You can get the list of splits from the shell, but only by peeking behind the curtain of the API. jruby-1.7.3 :012 > create 'example_table', 'f1', SPLITS => ['10', '20', '30', '40'] 0 row(s) in 0.5500 seconds => Hbase::Table - example_table jruby-1.7.3 :013

HBase connection pool

2015-02-25 Thread Marcelo Valle (BLOOMBERG/ LONDON)
In the HBase API, does one HTable object mean one connection to each region server (just for one table)? The docs say (http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/client/HTable.html): This class is not thread safe for reads nor write. I got confused, as I saw there is an HTablePool

oldWALs: what it is and how can I clean it?

2015-02-25 Thread Madeleine Piffaretti
Hi all, We are running out of space in our small hadoop cluster so I was checking disk usage on HDFS and I saw that most of the space was occupied by the /hbase/oldWALs folder. I have checked the HBase Definitive Book and other books and web-sites, and I have also searched for my issue on Google, but

Re: Table.get(List&lt;Get&gt;) overwhelms several RSs

2015-02-25 Thread Nick Dimiduk
How large is your region server heap? What's your setting for hfile.block.cache.size? Can you identify which region is being burned up (i.e., is it META?) It is possible for a hot region to act as a death pill that roams around the cluster. We see this with the meta region with poorly-behaved

RE: Table.get(List&lt;Get&gt;) overwhelms several RSs

2015-02-25 Thread Ted Tuttle
Hard to say how balanced the table is. We have a mixed requirement where we want some locality for timeseries queries against clusters of information. However the clusters in a table should be well distributed if the dataset is large enough. The query in question killed 5 RSs so I am

Re: unable to limit number of records using scan with hbase version 0.94.15-cdh4.7.0

2015-02-25 Thread Devaraja Swami
Gurpreet, you can check this post on stack overflow: http://stackoverflow.com/questions/14002948/command-like-sql-limit-in-hbase/28130609 On Wed, Feb 25, 2015 at 7:27 AM, Ted Yu yuzhih...@gmail.com wrote: Gurpreet: Would the following method of Scan serve your needs ? * To limit the maximum

Re: oldWALs: what it is and how can I clean it?

2015-02-25 Thread Nishanth S
Do you have replication turned on in hbase, and if so, is your slave consuming the replicated data? -Nishanth On Wed, Feb 25, 2015 at 10:19 AM, Madeleine Piffaretti mpiffare...@powerspace.com wrote: Hi all, We are running out of space in our small hadoop cluster so I was checking disk
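For context (not from the thread itself): rolled WALs are moved to /hbase/oldWALs and deleted by the master's log cleaner once nothing, such as a replication queue, still references them. The retention window is governed by a TTL property; the value below is illustrative.

```xml
<!-- hbase-site.xml: how long the log cleaner keeps files in oldWALs
     once no replication queue (or other cleaner plugin) pins them. -->
<property>
  <name>hbase.master.logcleaner.ttl</name>
  <value>600000</value> <!-- 10 minutes, in milliseconds -->
</property>
```

If replication was ever enabled, a stale peer keeps every WAL pinned indefinitely; `list_peers` and `remove_peer` in the HBase shell show and clear such peers.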

Re: HBase connection pool

2015-02-25 Thread Nick Dimiduk
Okay, looks like you're using an implicitly managed connection. It should be fine to share a single config instance across all threads. The advantage of HTablePool over this approach is that the number of HTables would be managed independently from the number of Threads. This may or may not be a

RE: Table.get(List&lt;Get&gt;) overwhelms several RSs

2015-02-25 Thread Ted Tuttle
Heaps are 16G w/ hfile.block.cache.size = 0.5 Machines have 32G onboard and we used to run w/ 24G heaps but reduced them to lower GC times. Not so sure about which regions were hot. And I don't want to repeat and take down my cluster again :) What I know: 1) The request was about 4000 gets.

Re: HBase connection pool

2015-02-25 Thread Marcelo Valle (BLOOMBERG/ LONDON)
Hi Nick, I am using HBase version 0.96, I sent the link from version 0.94 because I haven't found the java API docs for 0.96, sorry about that. I have created the HTable directly from the config object, as follows: this.tlConfig = new ThreadLocalConfiguration() { @Override protected

Re: Table.get(List&lt;Get&gt;) overwhelms several RSs

2015-02-25 Thread Nick Dimiduk
How large are the KeyValues? Can you estimate how much data you're materializing for this query? HBase's RPC implementation does not currently support streaming, so the entire result set (all 4000 objects) will be held in memory to service the request. This is a known issue (I'm lacking on a JIRA

[ANNOUNCE] Apache Phoenix 4.3 released

2015-02-25 Thread James Taylor
The Apache Phoenix team is pleased to announce the immediate availability of the 4.3 release. Highlights include: - functional indexes [1] - map-reduce over Phoenix tables [2] - cross join support [3] - query hint to force index usage [4] - set HBase properties through ALTER TABLE - ISO-8601 date

Re: HBase with opentsdb creates huge .tmp file, runs out of hdfs space

2015-02-25 Thread sathyafmt
Hi John, What's the fix (committed on Jan 11) you mentioned? Is it in opentsdb or hbase? Do you have a JIRA #? Thanks On Wed, Feb 25, 2015 at 5:49 AM, brady2 [via Apache HBase] ml-node+s679495n4068627...@n3.nabble.com wrote: Hi Sathya and Nick, Here are the stack traces of the region

Re: HBase scan time range, inconsistency

2015-02-25 Thread Stephen Durfey
What's the TTL setting for your table? Which hbase release are you using? Was there compaction in between the scans? Thanks The TTL is set to the max. The HBase version is 0.94.6-cdh4.4.0. I don’t want to say compactions aren’t a factor, but the jobs are short lived (4-5 minutes), and

Re: Table.get(List&lt;Get&gt;) overwhelms several RSs

2015-02-25 Thread Ted Yu
bq. The 4000 keys are likely contiguous and therefore probably represent entire regions In that case you can convert the multi-gets to a Scan with a proper batch size and start/stop rows. Cheers On Wed, Feb 25, 2015 at 10:16 AM, Ted Tuttle t...@mentacapital.com wrote: Heaps are 16G w/
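The conversion Ted suggests needs a stop row strictly after the last key: scan stop rows are exclusive, so the usual trick is appending a single 0x00 byte to the last key. A sketch of deriving the range from sorted row keys, using plain byte arrays rather than HBase types:

```java
import java.util.Arrays;

public class ScanRange {
    // Start row is the first key (inclusive). The stop row is exclusive,
    // so append 0x00 to the last key to make the scan include it.
    public static byte[] stopRowAfter(byte[] lastKey) {
        // copyOf zero-pads, leaving a trailing 0x00 byte.
        return Arrays.copyOf(lastKey, lastKey.length + 1);
    }

    public static void main(String[] args) {
        byte[] last = "row-4000".getBytes();
        byte[] stop = stopRowAfter(last);
        System.out.println(stop.length == last.length + 1); // true
        System.out.println(stop[stop.length - 1] == 0);     // true
    }
}
```

The resulting bytes would feed Scan.setStartRow/setStopRow, with setCaching sized to the expected row count.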

Re: HBase connection pool

2015-02-25 Thread Nick Dimiduk
Hi Marcelo, First thing, to be clear, you're working with a 0.94 release? The reason I ask is we've been doing some work in this area to improve things, so semantics may be slightly different between 0.94, 0.98, and 1.0. How are you managing the HConnection object (or are you)? How are you

Re: [ANNOUNCE] Apache HBase 1.0.0 is now available for download

2015-02-25 Thread zhou_shuaif...@sina.com
Great! zhou_shuaif...@sina.com From: Enis Söztutar Date: 2015-02-24 16:28 To: hbase-user; d...@hbase.apache.org Subject: [ANNOUNCE] Apache HBase 1.0.0 is now available for download The HBase Team is pleased to announce the immediate release of HBase 1.0.0. Download it from your favorite

Re: HBase scan time range, inconsistency

2015-02-25 Thread Stephen Durfey
Are you writing any Deletes? Are you writing any duplicates? No physical deletes are occurring in my data, and there is a very real possibility of duplicates. How is the partitioning done? The key structure would be /partition_id/person_id I'm dealing with clinical data, with a data

Re: HBase scan time range, inconsistency

2015-02-25 Thread Sean Busbey
Are you writing any Deletes? Are you writing any duplicates? How is the partitioning done? What does the entire key structure look like? Are you doing the column filtering with a custom filter or one of the prepackaged ones? On Wed, Feb 25, 2015 at 12:57 PM, Stephen Durfey sjdur...@gmail.com

Re: [ANNOUNCE] Apache HBase 1.0.0 is now available for download

2015-02-25 Thread Lars George
Great work everyone! Congratulations, this is the most awesome community to be in. Some coverage: - https://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces72 - http://www.heise.de/developer/meldung/Big-Data-HBase-1-0-erschienen-2558708.html On Tue, Feb 24, 2015 at