Re: distributing new regions immediately
very interesting. Thanks Ted

From: Ted Yu
Sent: Thursday, July 27, 2017 2:13:25 PM
To: user@hbase.apache.org
Subject: Re: distributing new regions immediately

Since you're more concerned with write load, you can take a look at the following parameter:

hbase.master.balancer.stochastic.writeRequestCost

Default value is 5, much smaller than default value for region count cost (500).
Consider raising the value so that load balancer reacts more responsively.

On Thu, Jul 27, 2017 at 12:17 PM, jeff saremi wrote:
> We haven't done enough testing for me to say this with certainty but as we
> insert data and new regions get created, it could be a while before those
> regions are distributed. As such and if the data injection continues the
> load on the region server becomes overwhelming
>
> Is there a way to expedite the distribution of regions among available
> region servers?
>
> thanks
>
>
Re: distributing new regions immediately
Thanks Dima

From: Dima Spivak
Sent: Thursday, July 27, 2017 12:38:56 PM
To: user@hbase.apache.org
Subject: Re: distributing new regions immediately

Presplitting tables is typically how this is addressed in production cases.

On Thu, Jul 27, 2017 at 12:17 PM jeff saremi wrote:
> We haven't done enough testing for me to say this with certainty but as we
> insert data and new regions get created, it could be a while before those
> regions are distributed. As such and if the data injection continues the
> load on the region server becomes overwhelming
>
> Is there a way to expedite the distribution of regions among available
> region servers?
>
> thanks
>
>

--
-Dima
Graph Analytics on HBase With HGraphDB and Apache Flink Gelly
For those who are interested, yet another blog on analyzing graphs stored in HBase, this time with Apache Flink Gelly: https://yokota.blog/2017/07/27/graph-analytics-on-hbase-with-hgraphdb-and-apache-flink-gelly/
Re: distributing new regions immediately
Since you're more concerned with write load, you can take a look at the following parameter:

hbase.master.balancer.stochastic.writeRequestCost

Default value is 5, much smaller than default value for region count cost (500).
Consider raising the value so that load balancer reacts more responsively.

On Thu, Jul 27, 2017 at 12:17 PM, jeff saremi wrote:
> We haven't done enough testing for me to say this with certainty but as we
> insert data and new regions get created, it could be a while before those
> regions are distributed. As such and if the data injection continues the
> load on the region server becomes overwhelming
>
> Is there a way to expedite the distribution of regions among available
> region servers?
>
> thanks
>
>
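
To make the 5 versus 500 comparison concrete, here is a minimal sketch of why a small write-request weight barely influences the balancer. It assumes the stochastic balancer combines per-function costs, each normalised to [0, 1], as a weighted sum; the class, the method names, and the skew values 0.8 and 0.1 are hypothetical and chosen only for illustration. This is not the HBase implementation.

```java
// Illustration only: a simplified model, not the HBase StochasticLoadBalancer code.
class WeightedCostSketch {

    // Assumption: each cost function yields a value in [0, 1] and the balancer
    // sums weight * cost over all functions. Only two functions are modelled here.
    static double totalCost(double writeWeight, double writeCost,
                            double regionCountWeight, double regionCountCost) {
        return writeWeight * writeCost + regionCountWeight * regionCountCost;
    }

    // Fraction of the total cost that comes from the write-request term.
    static double writeShare(double writeWeight, double writeCost,
                             double regionCountWeight, double regionCountCost) {
        double total = totalCost(writeWeight, writeCost, regionCountWeight, regionCountCost);
        return total == 0.0 ? 0.0 : (writeWeight * writeCost) / total;
    }

    public static void main(String[] args) {
        double writeSkew = 0.8;   // hypothetical: writes are badly skewed
        double regionSkew = 0.1;  // hypothetical: region counts are nearly even

        // Default weights from the thread: write request cost 5, region count cost 500.
        System.out.printf("write weight 5:   share = %.3f%n",
                writeShare(5, writeSkew, 500, regionSkew));
        // A raised write weight (100 is an arbitrary example value).
        System.out.printf("write weight 100: share = %.3f%n",
                writeShare(100, writeSkew, 500, regionSkew));
    }
}
```

Under these assumptions, a heavily skewed write load contributes only a small fraction of the total cost at the default weight, which is why raising the weight would make the balancer pay more attention to it.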
Re: distributing new regions immediately
Presplitting tables is typically how this is addressed in production cases.

On Thu, Jul 27, 2017 at 12:17 PM jeff saremi wrote:
> We haven't done enough testing for me to say this with certainty but as we
> insert data and new regions get created, it could be a while before those
> regions are distributed. As such and if the data injection continues the
> load on the region server becomes overwhelming
>
> Is there a way to expedite the distribution of regions among available
> region servers?
>
> thanks
>
>

--
-Dima
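
As a rough illustration of what presplitting involves, the sketch below computes evenly spaced split points for a table. It assumes row keys begin with a uniformly distributed 8-character lowercase hex prefix (for example, a hash of the natural key); that key design, the class name, and the method name are assumptions made for this example, not something stated in the thread. The resulting keys would then be converted to bytes and passed to a table-creation call that accepts split keys.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: computes split points for a hypothetical hex-prefixed key design.
class PresplitKeysSketch {

    // Size of the assumed key space: 8 hex digits = 2^32 possible prefixes.
    private static final long KEYSPACE = 1L << 32;

    // Returns numRegions - 1 boundaries that divide the key space into
    // numRegions equal ranges. Each boundary is the first key of a region.
    static List<String> splitPoints(int numRegions) {
        if (numRegions < 1) {
            throw new IllegalArgumentException("numRegions must be at least 1");
        }
        List<String> points = new ArrayList<>();
        for (int i = 1; i < numRegions; i++) {
            long boundary = ((long) i * KEYSPACE) / numRegions;
            points.add(String.format("%08x", boundary));
        }
        return points;
    }

    public static void main(String[] args) {
        // Print the boundaries for a table presplit into 4 regions.
        for (String key : splitPoints(4)) {
            System.out.println(key);
        }
    }
}
```

With the regions created up front, writes are spread across region servers from the first insert instead of waiting for splits and a balancer run. This only helps if the row keys really are spread evenly over the chosen key space.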
distributing new regions immediately
We haven't done enough testing for me to say this with certainty but as we insert data and new regions get created, it could be a while before those regions are distributed. As such and if the data injection continues the load on the region server becomes overwhelming.

Is there a way to expedite the distribution of regions among available region servers?

thanks
Re: HBase GET operation max row size - partial results
AFAIK there's no max result size or partial result for a get request. If we add such a feature in the future, we will add a release note in JIRA. (Actually we have implemented such a limit in our customized version and it requires the CP to handle it correctly; we may upstream the feature later.)

Some more detailed information at code level:

When saying "get uses scan internally", we mean it reuses the scan logic in the HRegion class. But at the rpc service level in RSRpcServices, there are two different methods (get and scan) for these two kinds of requests, and currently you'll only find the max result size limit below in the scan code path:

{code}
long maxResultSize;
if (scanner.getMaxResultSize() > 0) {
  maxResultSize = Math.min(scanner.getMaxResultSize(), maxQuotaResultSize);
} else {
  maxResultSize = maxQuotaResultSize;
}
...
ScannerContext.Builder contextBuilder = ScannerContext.newBuilder(true);
// maxResultSize - either we can reach this much size for all cells(being read) data or sum
// of heap size occupied by cells(being read). Cell data means its key and value parts.
contextBuilder.setSizeLimit(sizeScope, maxResultSize, maxResultSize);
...
ScannerContext scannerContext = contextBuilder.build();
while (numOfResults < maxResults) {
  ...
  moreRows = scanner.nextRaw(values, scannerContext);
  ...
{code}

Hope it helps.

Best Regards,
Yu

On 27 July 2017 at 09:54, Anoop John wrote:
> You mean within your RegionObserver you are doing the get? Within
> which hook? What is the way you are doing the get? Can u paste
> that sample code.
>
> -Anoop-
>
> On Wed, Jul 26, 2017 at 8:02 PM, Veerraju Tadimeti wrote:
> > Hi,
> >
> > If i use GET operation, is there any chance of getting partial result? If
> > Yes, under what circumstances. Is there any way to reproduce it?
> >
> > I am using GET operation in my coProcessor ( to the same region), adding
> > the result to the List . I am afraid that any chance of partial
> > result when using GET operation, since GET uses SCAN operation internally.
> >
> > Thanks,
> > Raju,
> > (972)273-0155.
>
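
The limit selection at the top of the quoted excerpt can be isolated into a small standalone sketch. The parameter names mirror the variables in the excerpt; the class and method names are hypothetical, and this models only the choice of the limit on the scan path, not how ScannerContext enforces it.

```java
// Sketch of the size-limit selection shown in the scan code path above.
// Not HBase code: it only mirrors the if/else from the quoted excerpt.
class ScanSizeLimitSketch {

    // If the scanner requested a positive limit, it is capped by the quota limit;
    // otherwise the quota limit alone applies.
    static long effectiveMaxResultSize(long scannerMaxResultSize, long maxQuotaResultSize) {
        if (scannerMaxResultSize > 0) {
            return Math.min(scannerMaxResultSize, maxQuotaResultSize);
        }
        return maxQuotaResultSize;
    }

    public static void main(String[] args) {
        long quota = 100L; // hypothetical quota limit in bytes
        System.out.println(effectiveMaxResultSize(0L, quota));   // no scanner limit set
        System.out.println(effectiveMaxResultSize(50L, quota));  // scanner limit below quota
        System.out.println(effectiveMaxResultSize(200L, quota)); // scanner limit above quota
    }
}
```

According to the message above, the get path in RSRpcServices has no equivalent of this step, which is why a get is not cut short by a size limit.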