Re: HBase read performance

2014-10-03 Thread Qiang Tian
Regarding profiling, Andrew introduced http://www.brendangregg.com/blog/2014-06-12/java-flame-graphs.html months ago. processCallTime comes from RpcServer#call, so it looks good? I have a suspicion: https://issues.apache.org/jira/browse/HBASE-11306 How many processes do you have for your 2000 t

Re: How to make a given table spread evenly across the cluster

2014-10-03 Thread Qiang Tian
According to the 2 pictures, it looks like the balancer did not run? On Fri, Oct 3, 2014 at 1:45 AM, Ted Yu wrote: > See also > HBASE-12139 StochasticLoadBalancer doesn't work on large lightly loaded > clusters > > Cheers > > On Thu, Oct 2, 2014 at 10:35 AM, Jean-Marc Spaggiari < > jean-m...@spaggiari.org>

Re: Get default split policy

2014-10-03 Thread Serega Sheypak
Hi, I can get it from the Java API. I was wondering whether there is any way to get it using the shell. 2014-10-02 23:48 GMT+04:00 Ted Yu : > Please see http://hbase.apache.org/book.html#arch.region.splits > > On Thu, Oct 2, 2014 at 12:46 PM, Serega Sheypak > wrote: > > > Hi, is it true that I can set spl

Re: Get default split policy

2014-10-03 Thread Serega Sheypak
HTableDescriptor descriptor = admin.getTableDescriptor(Bytes.toBytes("my_table")); System.out.println("descriptor.getRegionSplitPolicyClassName() : " + descriptor.getRegionSplitPolicyClassName()); prints: descriptor.getRegionSplitPolicyClassName() : null What am I doing wrong? 2014-10-03 12:53

Re: Get default split policy

2014-10-03 Thread Ted Yu
getRegionSplitPolicyClassName() calls getValue(SPLIT_POLICY) and: public String getValue(String key) { byte[] value = getValue(Bytes.toBytes(key)); if (value == null) return null; This means that the split policy wasn't set for this table. The following global setting would be eff

Re: Get default split policy

2014-10-03 Thread Ted Yu
Have you tried the 'describe' command? In case the SPLIT_POLICY property isn't set, the global config from hbase-default.xml would be effective. Cheers On Fri, Oct 3, 2014 at 1:53 AM, Serega Sheypak wrote: > Hi, I can get it from the Java API. I was wondering whether there is any way to > get it using the shell.

Re: Get default split policy

2014-10-03 Thread Serega Sheypak
I'm on Cloudera CDH 4.6. There is no such property. I tried to find it in: - the Cloudera Manager UI - the RegionServer UI, where there is a link to get the configuration - Cloudera Manager UI -> RegionServer role -> processes -> configuration. Here Cloudera Manager shows the configs used by running processes. no

Re: Get default split policy

2014-10-03 Thread Jean-Marc Spaggiari
Hi Serega, You don't see it because it's not modified. It just uses the default value. CDH 4.6 is hbase-0.94.15. So you have org.apache.hadoop.hbase.regionserver.IncreasingToUpperBoundRegionSplitPolicy except if you have overridden this property in your table. http://grepcode.com/file/repo1.mav
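
A minimal sketch of the lookup order described in this thread: a table-level SPLIT_POLICY attribute wins, then a cluster-wide setting, then the default baked into the code. It uses plain maps instead of the HBase classes so that it runs on its own. The class and method names are hypothetical, and the global property name hbase.regionserver.region.split.policy is an assumption that does not appear in the messages above.

```java
import java.util.HashMap;
import java.util.Map;

final class SplitPolicyLookup {
    // Table-level attribute name, as in getValue(SPLIT_POLICY) from the thread.
    static final String TABLE_KEY = "SPLIT_POLICY";
    // Cluster-wide property name (assumption, not quoted in the thread).
    static final String GLOBAL_KEY = "hbase.regionserver.region.split.policy";
    // Default named in the thread for hbase-0.94.15.
    static final String CODE_DEFAULT =
        "org.apache.hadoop.hbase.regionserver.IncreasingToUpperBoundRegionSplitPolicy";

    /** Returns the first policy found: table attribute, then site config, then code default. */
    static String effectivePolicy(Map<String, String> tableAttrs, Map<String, String> siteConf) {
        String policy = tableAttrs.get(TABLE_KEY);
        if (policy != null) {
            return policy;
        }
        policy = siteConf.get(GLOBAL_KEY);
        return policy != null ? policy : CODE_DEFAULT;
    }

    public static void main(String[] args) {
        // Nothing set anywhere: the table attribute is null, so the code default applies.
        System.out.println(effectivePolicy(new HashMap<>(), new HashMap<>()));
    }
}
```

This is why a null from getRegionSplitPolicyClassName() does not mean the table has no split policy; it only means the table does not override it.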

Re: How to make a given table spread evenly across the cluster

2014-10-03 Thread Jean-Marc Spaggiari
It depends on the time between the 2 screenshots. But if it was more than 5 minutes (default config), then you are correct and it should have been at least minimally balanced. 2014-10-03 3:30 GMT-04:00 Qiang Tian : > According to the 2 pictures, it looks like the balancer did not run? > > On Fri, Oct 3, 2014 at 1

Re: BulkLoad 200GB table with one region. Is it OK?

2014-10-03 Thread Jean-Marc Spaggiari
Even if you have 100 files, HBase will still need to read them to split them. Each file might contain keys for the 2 regions, so HBase will read 200GB, and write 100GB on each side. Last, I don't think the max file size will have any impact on the BulkLoad side. It's the way you generate your file whi

Re: Get default split policy

2014-10-03 Thread Serega Sheypak
I've read that in the CDH release notes, but I expected to find an explicit value somewhere in a configuration file. 2014-10-03 16:26 GMT+04:00 Jean-Marc Spaggiari : > Hi Serega, > > You don't see it because it's not modified. It just uses the default value. > > CDH 4.6 is hbase-0.94.15. So you have > >

Re: Get default split policy

2014-10-03 Thread Jean-Marc Spaggiari
Hi Serega, Default values for the split are in the code (as you can see in the other link I sent). So it would be redundant to define it again with the same value in the property file. Still doable, but I don't think it's a best practice. JM 2014-10-03 8:55 GMT-04:00 Serega Sheypak : > I've r

Re: Get default split policy

2014-10-03 Thread Serega Sheypak
Ok, I got it. The other consideration is that it's better to push default values up into the configuration. It makes the configuration explicit. Thank you for your help! 2014-10-03 17:37 GMT+04:00 Jean-Marc Spaggiari : > Hi Serega, > > Default values for the split are in the code (as you can see in the othe

Splits are not preserved during table copy using getAdmin().createTable(tableDescriptor);

2014-10-03 Thread Serega Sheypak
Hi, here is my code: public void dropIfExistsAndCreate(String sourceTableName, String newTableName) throws IOException { LOG.info(String.format("Use [%s] to create [%s]", sourceTableName, newTableName)); HTableDescriptor descriptor = getDescriptor(sourceTableName); dropIfEx

single column value filter to find rows greater than a certain date not working

2014-10-03 Thread aaaa342156
I am storing a date in a column in an HBase table. I am using the syntax below to find dates greater than 20110810, but the result set brings back all the rows. Any thoughts? Filter filter = new SingleColumnValueFilter(Bytes.toBytes("cf"), Bytes.toBytes("d"), CompareOp.GREATER_OR_EQUAL, Bytes.toBytes("2

Re: single column value filter to find rows greater than a certain date not working

2014-10-03 Thread Ted Yu
Which release of HBase are you using? Can you come up with a unit test that shows this problem? BTW, your use case is covered by TestSingleColumnValueFilter.java On Fri, Oct 3, 2014 at 4:35 AM, 342156 wrote: > I am storing a date in a column in an HBase table. I am using the syntax below to > find >

Re: Splits are not preserved during table copy using getAdmin().createTable(tableDescriptor);

2014-10-03 Thread Serega Sheypak
Ok, I found them: hTable.getRegionLocations().descendingKeySet() and HRegionInfo has startKey and endKey. I have to prepare splits[][]. I'm a little confused: how should the keys be placed there? Should splits[][] be an array with length = region count and width = 2? 2014-10-03 17:57 GMT+04:00 Serega Sh

Re: Splits are not preserved during table copy using getAdmin().createTable(tableDescriptor);

2014-10-03 Thread Ted Yu
Take a look at this method in HTable: public Pair<byte[][],byte[][]> getStartEndKeys() throws IOException { You would see that the first dimension corresponds to the number of regions. Cheers On Fri, Oct 3, 2014 at 8:33 AM, Serega Sheypak wrote: > Ok, I found them: > hTable.getRegionLocations().descendingKey

Re: [HBase] are column qualifiers safe as user inputed values?

2014-10-03 Thread Joe Pepersack
Ted, I am not aware of anything within HBase itself that would be affected by malicious character strings, although there is a significant probability that an unpublicized vulnerability exists. However, you have to consider which API you are using as well: in the Thrift API the Key, Qualifi

Re: Splits are not preserved during table copy using getAdmin().createTable(tableDescriptor);

2014-10-03 Thread Serega Sheypak
Thanks, I've already updated my end-to-end test and am waiting for the result. Here is my code snippet, is it ok? @SneakyThrows(IOException.class) private byte[][] collectSplits(String tableName){ HTable table = new HTable(configuration, tableName); int splitSize = table.getRegionLo

Re: Splits are not preserved during table copy using getAdmin().createTable(tableDescriptor);

2014-10-03 Thread Ted Yu
You can simplify your code by utilizing the following from HTable: public byte [][] getStartKeys() throws IOException { No need to sort the keys. Cheers On Fri, Oct 3, 2014 at 9:10 AM, Serega Sheypak wrote: > Thanks, I've already updated my end-to-end test and am waiting for the result. > > Here i

Re: Splits are not preserved during table copy using getAdmin().createTable(tableDescriptor);

2014-10-03 Thread Serega Sheypak
So easy, thanks :) I've missed that method. 2014-10-03 20:23 GMT+04:00 Ted Yu : > You can simplify your code by utilizing the following from HTable: > > public byte [][] getStartKeys() throws IOException { > > No need to sort the keys. > > Cheers > > On Fri, Oct 3, 2014 at 9:10 AM, Serega Sheyp

Re: Splits are not preserved during table copy using getAdmin().createTable(tableDescriptor);

2014-10-03 Thread Serega Sheypak
Do I have to pass an array of startKeys or an array of endKeys? 2014-10-03 20:29 GMT+04:00 Serega Sheypak : > So easy, thanks :) I've missed that method. > > 2014-10-03 20:23 GMT+04:00 Ted Yu : > >> You can simplify your code by utilizing the following from HTable: >> >> public byte [][] getSta

Re: Splits are not preserved during table copy using getAdmin().createTable(tableDescriptor);

2014-10-03 Thread Ted Yu
startKeys contains an empty byte array at the front and endKeys contains an empty byte array at the end. You can strip out the empty byte array from startKeys and pass the result to the table creation API. Cheers On Fri, Oct 3, 2014 at 10:57 AM, Serega Sheypak wrote: > Do I have to pass an array of startKeys or
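
A minimal sketch of the stripping step described above, written against plain byte arrays so it runs without HBase on the classpath. The class and method names are hypothetical. It drops zero-length keys, so the same helper works for either the start-key array or the end-key array.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SplitKeys {
    /** Drops zero-length keys: the open-ended first start key or last end key. */
    static byte[][] fromRegionKeys(byte[][] regionKeys) {
        List<byte[]> splits = new ArrayList<>();
        for (byte[] key : regionKeys) {
            if (key.length > 0) {
                splits.add(key);
            }
        }
        return splits.toArray(new byte[0][]);
    }

    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Three regions: ["", "g"), ["g", "p"), ["p", "")
        byte[][] startKeys = { new byte[0], bytes("g"), bytes("p") };
        byte[][] endKeys = { bytes("g"), bytes("p"), new byte[0] };

        byte[][] fromStart = fromRegionKeys(startKeys);
        byte[][] fromEnd = fromRegionKeys(endKeys);

        // Both arrays reduce to the same two split points, "g" and "p".
        System.out.println(Arrays.deepEquals(fromStart, fromEnd)); // prints "true"
    }
}
```

In the real copy routine, the input would come from getStartKeys() on the source table, and the result would be handed to the createTable overload that accepts split keys together with the descriptor.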

Re: Splits are not preserved during table copy using getAdmin().createTable(tableDescriptor);

2014-10-03 Thread Serega Sheypak
Ok, I got it. So it doesn't matter which key sequence I pass, right? The only requirement is to trim the first or last key, which has zero length. 2014-10-03 22:08 GMT+04:00 Ted Yu : > startKeys contains an empty byte array at the front and endKeys contains > an empty > byte array at the end. >

Re: Splits are not preserved during table copy using getAdmin().createTable(tableDescriptor);

2014-10-03 Thread Ted Yu
That's true. Cheers On Fri, Oct 3, 2014 at 11:16 AM, Serega Sheypak wrote: > Ok, I got it. So it doesn't matter which key sequence I pass, right? > The only requirement is to trim the first or last key, which has zero > length. > > 2014-10-03 22:08 GMT+04:00 Ted Yu : > > > startKeys contain

Re: Splits are not preserved during table copy using getAdmin().createTable(tableDescriptor);

2014-10-03 Thread Serega Sheypak
Great! Thanks! 2014-10-03 22:18 GMT+04:00 Ted Yu : > That's true. > > Cheers > > On Fri, Oct 3, 2014 at 11:16 AM, Serega Sheypak > wrote: > > > Ok, I got it. So it doesn't matter which key sequence I pass, right? > > The only requirement is to trim the first or last key, which has zero > > l

Re: single column value filter to find rows greater than a certain date not working

2014-10-03 Thread lars hofhansl
> Bytes.toBytes("20110810") Is that exactly how you are storing the dates? As a string converted to bytes? Or did you store them as a long converted to bytes? Also note that this is a fairly inefficient way of doing this. If this is the typical access pattern you should put the data in the row ke
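
A small sketch of why the storage encoding matters for this kind of filter. It assumes the comparison is an unsigned, byte-by-byte one, that strings are encoded as UTF-8, and that longs are encoded as 8 big-endian bytes. The helper names are hypothetical and the code does not use the HBase classes. It only illustrates the encoding question raised above and is not a diagnosis of the original poster's result.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class DateBytes {
    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    /** Date stored as text, e.g. "20110810". */
    static byte[] ofString(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Date stored as an 8-byte big-endian long, e.g. 20110810L. */
    static byte[] ofLong(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    public static void main(String[] args) {
        // Same encoding on both sides: a later date compares greater.
        System.out.println(compare(ofString("20110901"), ofString("20110810")) > 0); // true
        System.out.println(compare(ofLong(20110901L), ofLong(20110810L)) > 0);       // true

        // Mixed encodings: the long's leading 0x00 byte sorts before the character '2',
        // so a later date stored as a long still compares as smaller than the string.
        System.out.println(compare(ofLong(20110901L), ofString("20110810")) >= 0);   // false
    }
}
```

The filter value therefore has to be built with the same encoding that was used when the column was written.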

RE: HBase read performance

2014-10-03 Thread Khaled Elmeleegy
Lars, Ted, and Qiang, Thanks for all the input. Qiang: yes, all the threads are in the same client process sharing the same connection. And since I don't see hardware contention, maybe there is contention over this code path. I'll try using many connections and see if it alleviates the problems

Re: Put related Region server connection

2014-10-03 Thread Ted Yu
bq. recommendation to how many open connections per region server at a time? This depends on the number of handlers you configure per region server. bq. that means 1000 x 100 open connections at any given moment? If each of your processes talks to 100 region servers simultaneously. Cheers On T

Puts failing with WrongRegionException

2014-10-03 Thread Thomas Kwan
Hi there, I wonder if anyone has seen an error like this: 2014-10-03 16:03:45,203 WARN [RpcServer.handler=7,port=60020] regionserver.HRegion: Failed getting lock in batch put, row=65317d52abfedc8b94a19f6fbffe187c org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for

Re: Puts failing with WrongRegionException

2014-10-03 Thread Ted Yu
Can you check the region server log to see if region m_test,64d7e88463b88e7325b623fbd6629cda,1408803862959.cb513be341b94588469efa9d26d29857. moved / split between MR job launch and the time when this error showed up? Thanks On Fri, Oct 3, 2014 at 4:08 PM, Thomas Kwan wrote: > Hi there, > > Wo
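
A rough sketch of the range check that a "row out of range" error implies: a row belongs to a region when it is at or after the start key and strictly before the end key, with an empty key meaning open-ended. The start key and row are taken from the messages above. The end keys are hypothetical, because the log line is cut off before it shows the region's real boundaries. The class and method names are also hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class RegionRange {
    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    /** True when startKey <= row < endKey; a zero-length key is treated as unbounded. */
    static boolean rowInRange(byte[] startKey, byte[] endKey, byte[] row) {
        boolean atOrAfterStart = startKey.length == 0 || compare(startKey, row) <= 0;
        boolean beforeEnd = endKey.length == 0 || compare(row, endKey) < 0;
        return atOrAfterStart && beforeEnd;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] start = bytes("64d7e88463b88e7325b623fbd6629cda"); // from the region name
        byte[] row = bytes("65317d52abfedc8b94a19f6fbffe187c");   // from the log line

        // Hypothetical end key as the client may have cached it at job launch.
        System.out.println(rowInRange(start, bytes("66"), row)); // true
        // Hypothetical end key after a split: the same row now falls outside the region.
        System.out.println(rowInRange(start, bytes("65"), row)); // false
    }
}
```

If the region split or moved after the job cached its boundaries, puts routed with the old boundaries can fail this check on the server, which is what the question above is trying to confirm.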