What is meant by "8% quartile"? The 75th percentile? The 98th percentile? Should "quartile" have been "quantile"?
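To make the terminology concrete, here is a minimal sketch of a nearest-rank percentile over latency samples. The function name `percentile` and the sample values are made up for illustration; they are not measurements from this thread. Under this definition the quartiles are simply the 25th, 50th and 75th percentiles, while "quantile" is the general term.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Rank is 1-based, so subtract 1 when indexing.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical time-to-last-byte samples in milliseconds (invented numbers).
ttlb_ms = [1.2, 1.4, 1.5, 1.7, 1.9, 2.1, 2.4, 2.8, 9.5, 41.0]

print("median (2nd quartile):", percentile(ttlb_ms, 50))
print("75th percentile (3rd quartile):", percentile(ttlb_ms, 75))
print("98th percentile:", percentile(ttlb_ms, 98))
```

With a long tail like the one in these invented samples, the 75th percentile can look fine while the 98th is dominated by outliers, so it matters which one is being quoted.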
On Wed, Apr 20, 2011 at 12:00 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:

> Ok actually we do have 1 region for these exact tables... so back to
> square one.
>
> FWIW i do get 8% quartile under 3ms TTLB. So it is algorithmically
> sound it seems. question is why outliers spread is so much longer than
> in tests on one machine. must be network. What else.
>
>
> On Wed, Apr 20, 2011 at 10:06 AM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> > Got it. This must be the reason. Cause it is a laugh check, and i do
> > see 6 regions for 40 rows so it can span them, although i can't
> > confirm it for sure. It may be due to how table was set up or due to
> > some time running them and rotating some data there. The uniformly
> > distributed hashes are used for the keys so that it is totally
> > plausible 40 rows will get into 6 different regions.
> >
> > Ok i'll take it for working theory for now.
> >
> > Is there a way to set max # of regions per table? I guess the method
> > in the manual is to set max region size. Which means i probably need
> > to re-create the table with one region to get back to 1 region? or
> > maybe there's a way to get it back to one region without recreating
> > it, such as major compaction?
> >
> > thanks.
> > -d
> >
> > On Wed, Apr 20, 2011 at 9:55 AM, Stack <st...@duboce.net> wrote:
> >> On Wed, Apr 20, 2011 at 9:49 AM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> >>> Ok. Let me ask a question.
> >>>
> >>> When scan is performed and it obviously covers several regions, are
> >>> scan performance calls done in synchronous succession or they are done
> >>> in parallel?
> >>>
> >>
> >> The former.
> >>
> >>
> >>> Assuming scan is returning 40 results but for some weird reason it
> >>> goes to 6 regions and caching is set to 100 (so it can take all of
> >>> them) are individual region request latencies summed or it would be
> >>> max(region request latency)?
> >>>
> >>
> >> Summed.
> >>
> >> The 40 rows are not contiguous in the same region? If not, the cost
> >> of client setting up new scanner against next region will be inline w/
> >> your read timing (at least an rpc per region).
> >>
> >> St.Ack
> >>
> >>> Thank you very much.
> >>> -D
> >>>
> >>> On Tue, Apr 19, 2011 at 6:28 PM, Ted Dunning <tdunn...@maprtech.com> wrote:
> >>>> For a tiny test like this, everything should be in memory and latency
> >>>> should be very low.
> >>>>
> >>>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> >>>>> PS so what should latency be for reads in 0.90, assuming moderate thruput?
> >>>>>
> >>>>> On Tue, Apr 19, 2011 at 5:39 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> >>>>>> for this test, there's just no more than 40 rows in every given table.
> >>>>>> This is just a laugh check.
> >>>>>>
> >>>>>> so i think it's safe to assume it all goes to same region server.
> >>>>>>
> >>>>>> But latency would not depend on which server call is going to, would
> >>>>>> it? Only throughput would, assuming we are not overloading.
> >>>>>>
> >>>>>> And we clearly are not as my single-node local version runs quite ok
> >>>>>> response times with the same throughput.
> >>>>>>
> >>>>>> It's something with either client connections or network latency or
> >>>>>> ... i don't know what it is. I did not set up the cluster but i gotta
> >>>>>> troubleshoot it now :)
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Apr 19, 2011 at 5:23 PM, Ted Dunning <tdunn...@maprtech.com> wrote:
> >>>>>>> How many regions? How are they distributed?
> >>>>>>>
> >>>>>>> Typically it is good to fill the table somewhat and then drive some
> >>>>>>> splits and balance operations via the shell. One more split to make
> >>>>>>> the regions be local and you should be good to go. Make sure you have
> >>>>>>> enough keys in the table to support these splits, of course.
> >>>>>>>
> >>>>>>> Under load, you can look at the hbase home page to see how
> >>>>>>> transactions are spread around your cluster. Without splits and local
> >>>>>>> region files, you aren't going to see what you want in terms of
> >>>>>>> performance.
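
As a rough illustration of the "summed" answer quoted above, here is a toy model of scan latency when the client visits regions one after another, compared with a hypothetical parallel fan-out. The function names and the per-region timings are invented for the example; this is not HBase client code and the numbers are not measurements.

```python
def sequential_scan_latency(region_latencies_ms):
    """Regions are visited one after another, so per-region latencies add up."""
    return sum(region_latencies_ms)


def parallel_scan_latency(region_latencies_ms):
    """Hypothetical fan-out: total time is bounded by the slowest region."""
    return max(region_latencies_ms, default=0)


# Invented per-region round-trip times (ms) for a 40-row scan spanning 6 regions.
per_region_ms = [2, 3, 2, 4, 3, 2]

print("sequential:", sequential_scan_latency(per_region_ms), "ms")
print("parallel (hypothetical):", parallel_scan_latency(per_region_ms), "ms")
```

Under this model, a scan that touches six regions pays six round trips back to back, which is one way a tiny table could show a much longer tail than the same scan against a single region.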