Agreed, too many fat topics, but all important. I guess we can spend the first 10-20 minutes on the agenda, based on who is in the room, come up with a shorter list, and go from there.
Enis

On Tue, Aug 11, 2015 at 9:23 PM, Stack <st...@duboce.net> wrote:

> On Mon, Jul 20, 2015 at 1:04 PM, Stephen Jiang <syuanjiang...@gmail.com> wrote:
>
> > [Let us move back to the main topic - a meeting to talk about the next direction on HBASE development]
> >
> > Are we firm on the *August 26th* meeting date?
> >
> > Given the long list of topics from St.Ack, even a one day meeting might not cover all of them (in depth). We need to either trim the topic list or limit the time to discuss a single topic (30 min for one topic enough?).
> >
>
> Thanks for bringing us back to topic Stephen.
>
> Yes, lets do 26th. Speak up if this does not suit. I will file a meetup page in an hour or so. Where should we do it? Enis offered his nice place. Could try and get space at ours too... in Palo Alto (less 'deep south', a little easier for the SFers).
>
> As to too many topics, in my experience, a bunch of smelly engineers all in a room starts to fall apart after a couple of hours especially when ranging discussion. Suggest we cut the time-per-topic and list of topics so can do in an afternoon. If some topics are too fat, can do break out or put-off to another day and smaller, interested group.
>
> St.Ack
>
> > Thanks
> > Stephen
> >
> > On Mon, Jul 20, 2015 at 9:50 AM, Anoop John <anoop.hb...@gmail.com> wrote:
> >
> >> We will be doing some more large data tests in coming week Andy.. Will report back more. Also will do a write up , in what all ways the work might help us. As Sean said, we will continue in another thread if any thing further.. Will soon write back on the test result. Thanks.
> >>
> >> -Anoop-
> >>
> >> On Mon, Jul 20, 2015 at 9:59 PM, Andrew Purtell <andrew.purt...@gmail.com> wrote:
> >>
> >> > Cool, thanks.
> >> >
> >> > Is a 20% latency reduction the most we can expect or do you think there is room for more improvement? Just curious.
> >> >
> >> > Is latency reduction the only goal? Anything here about supporting larger heaps? Is there something we can measure in that regard?
> >> >
> >> > Hope you see my point and there's enough here to prime a goals and metrics discussion at the pow wow or on the relevant JIRAs.
> >> >
> >> > > On Jul 20, 2015, at 4:43 AM, ramkrishna vasudevan <ramkrishna.s.vasude...@gmail.com> wrote:
> >> > >
> >> > > Hi Andy
> >> > >
> >> > > Based on our POCs done, we expect around 20% improvement in latency. For scans it will be little lesser than 20%.
> >> > >
> >> > > Regards
> >> > > Ram
> >> > >
> >> > > On Sun, Jul 19, 2015 at 10:20 AM, Andrew Purtell <andrew.purt...@gmail.com> wrote:
> >> > >
> >> > >> Hi Ram,
> >> > >>
> >> > >> Do you have any targets for what you are measuring? What are the goals you guys are working toward with the off heaping changes?
> >> > >>
> >> > >>>> On Jul 18, 2015, at 9:16 PM, ramkrishna vasudevan <ramkrishna.s.vasude...@gmail.com> wrote:
> >> > >>>
> >> > >>> Thanks Vladimir.
> >> > >>> Yeah, the reports that were attached specifically captured the 95/99th percentile.
> >> > >>> The reason for checking the server side perf was to specifically see the improvement in the server side and also the client was sending large results in multiple threads. So wanted to avoid the n/w interference. I think it was a general practice that we were following.
> >> > >>> We Wil do some more tests and get some latest readings with bigger data sets.
> >> > >>> Sent from mobile.
> >> > >>>> On Jul 19, 2015 1:05 AM, "Andrew Purtell" <andrew.purt...@gmail.com> wrote:
> >> > >>>>
> >> > >>>> +1
> >> > >>>>
> >> > >>>> Yeah, something like that, with aspirational targets for improvement from current releases. Then what to measure, the tests to run, and criteria for evaluation are clear and organized and we're able to better assess how the work in progress is meeting its goals (or not)
> >> > >>>>
> >> > >>>> On Jul 18, 2015, at 12:05 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:
> >> > >>>>
> >> > >>>>>>> Umbrella jira to make sure we can have blocks cached in offheap backed cache. In the entire read path, we can refer to this offheap buffer and avoid onheap copying.
> >> > >>>>>
> >> > >>>>> I think, on a read path, the most important improvement we could imagine is elimination or reducing of object creations (KVs, iterators etc). object reuse, byte buffers reuse or offheap buffers reuse, API change etc.
> >> > >>>>> If this is a part of this JIRA, then I would easily define a goal: improving 95/99% latency of a read operations. Not performance, but latency matters
> >> > >>>>>
> >> > >>>>> -Vlad
> >> > >>>>>
> >> > >>>>> On Sat, Jul 18, 2015 at 11:24 AM, Andrew Purtell <andrew.purt...@gmail.com> wrote:
> >> > >>>>>
> >> > >>>>>> That's not a realistic or useful test scenario, unless the goal is to accelerate queries where all cells are filtered at the server.
> >> > >>>>>>
> >> > >>>>>>> On Jul 18, 2015, at 11:02 AM, Anoop John <anoop.hb...@gmail.com> wrote:
> >> > >>>>>>>
> >> > >>>>>>> No Andy. 11425 having doc attached to it. At the end of it, we have added perf numbers in a cluster testing. This was done using PE get and scan tests with filtering all cells at server (to not consider n/w bandwidth constraints)
> >> > >>>>>>>
> >> > >>>>>>> -Anoop-
> >> > >>>>>>>
> >> > >>>>>>> On Sat, Jul 18, 2015 at 9:30 PM, Andrew Purtell <andrew.purt...@gmail.com> wrote:
> >> > >>>>>>>
> >> > >>>>>>>> We have some microbenchmarks, not evidence of differences seen from a client application. I'm not saying that microbenchmarks are not totally necessary and a great start - they are - but that they don't measure an end goal. Furthermore unless I've missed one somewhere we don't have a JIRA or design doc that states a clear end goal metric like the strawman I threw together in my previous mail. A measurable system level goal and some data from full cluster testing would go a lot further toward letting all of us evaluate the potential and payoff of the work. In the meantime we should probably be assembling these changes on a branch instead of in trunk, for as long as the goal is not clearly defined and the payoff and potential for perf regressions is untested and unknown.
> >> > >>>>>>>>
> >> > >>>>>>>>> On Jul 18, 2015, at 8:05 AM, Anoop John <anoop.hb...@gmail.com> wrote:
> >> > >>>>>>>>>
> >> > >>>>>>>>> Thanks Andy and Lars. The parent jira has doc attached which contains some perf gain numbers.. We will be doing more tests in next 2 weeks (before end of this month) and will publish them. Yes it will be great if it is more IST friendly time :-)
> >> > >>>>>>>>>
> >> > >>>>>>>>> -Anoop-
> >> > >>>>>>>>>
> >> > >>>>>>>>> On Fri, Jul 17, 2015 at 9:44 PM, Andrew Purtell <andrew.purt...@gmail.com> wrote:
> >> > >>>>>>>>>
> >> > >>>>>>>>>>> I can represent your side Ram (and Anoop). I've been known always argue both side of a discussion and to never take sides easily (drives some folks crazy).
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> I can vouch for this (smile)
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> I also can offer support for off heaping there. At the same time we do have a gap where we can't point to a timeline of improvements (yet, anyway) with benchmarks showing gains where your goals need them. For example, stock HBase in one JVM can address max N GB for response time distribution D; dev version of HBase in off heap branch can address max N' GB for distribution D', where N' > N and D > D' (distribution D' statistically shows better/lower response times).
> >> > >>>>>>>>>>
> >> > >>>>>>>>>>> On Jul 17, 2015, at 6:56 AM, lars hofhansl <la...@apache.org> wrote:
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> I'm in favor of anything that improves performance (and preferably doesn't set us back into a world that's worse than C due to the lack of pointers in Java). Never said "I don't like it", it's just that I'm perhaps asking for more numbers and justification in weighing the pros and cons.
> >> > >>>>>>>>>>> I can represent your side Ram (and Anoop). I've been known always argue both side of a discussion and to never take sides easily (drives some folks crazy). And Stack's there too, he yell at me where needed :)
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> Perhaps we can do it a bit later in the evening so there is a fighting chance that folks on IST can participate. I know that some of our folks on IST would love to participate in the backup discussion).
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> Like Enis, I'm also happy to host. We're in Downtown SF. I'd just need an approx. number of folks.
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> -- Lars
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> From: ramkrishna vasudevan <ramkrishna.s.vasude...@gmail.com>
> >> > >>>>>>>>>>> To: "dev@hbase.apache.org" <dev@hbase.apache.org>; lars hofhansl <la...@apache.org>
> >> > >>>>>>>>>>> Sent: Wednesday, July 15, 2015 10:10 AM
> >> > >>>>>>>>>>> Subject: Re: DISCUSSION: lets do a developer workshop on near-term work
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> Hi
> >> > >>>>>>>>>>> What time will it be on August 26th?
> >> > >>>>>>>>>>> @Lars
> >> > >>>>>>>>>>> Ya. I know that you are not generally in favour of this offheaping stuff. May be if we (from India) can attend this meeting remotely your thoughts can be discussed and also the current state of this work.
> >> > >>>>>>>>>>> Regards
> >> > >>>>>>>>>>> Ram
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> On Wed, Jul 15, 2015 at 9:28 PM, lars hofhansl <la...@apache.org> wrote:
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> Works for me. I'll be back in the Bay Area the week of August 9th.
> >> > >>>>>>>>>>> We have done a _lot_ of work on backups as well - ours are more complicated as we wanted fast per-tenant restores, so data is "grouped" by tenant. Would like to sync up on that (hopefully some of the folks who wrote most of the code will be in town, I'll check).
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> Also interested in the "Time" and "offheap" parts (although you folks usually do not like what I think about the offheap efforts :) ).
> >> > >>>>>>>>>>> Would like to add the following topics:
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> - "Timestamp Resolution". Or making space for more bits in the timestamps (happy to cover that, unless it's part of the "Time" topic)
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> - "Replication". We found that replication cannot keep up with high write loads, due to the fact that replicated is strictly single threaded per regionserver (even though we have multiple region servers on the sink side)
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> - "Spark integration" (Ted Malaska?)
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> OK... Out now to make a "bullshit hat".
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> -- Lars
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> ________________________________
> >> > >>>>>>>>>>> From: Sean Busbey <bus...@cloudera.com>
> >> > >>>>>>>>>>> To: dev <dev@hbase.apache.org>
> >> > >>>>>>>>>>> Sent: Tuesday, July 14, 2015 7:11 PM
> >> > >>>>>>>>>>> Subject: Re: DISCUSSION: lets do a developer workshop on near-term work
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> I'm planning to be in the Bay area the week of the 24th of August.
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> --
> >> > >>>>>>>>>>> Sean
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>>> On Jul 14, 2015 7:53 PM, "Andrew Purtell" <apurt...@apache.org> wrote:
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> I can be up in your area in August.
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>>>> On Tue, Jul 14, 2015 at 5:31 PM, Stack <st...@duboce.net> wrote:
> >> > >>>>>>>>>>>>>>
> >> > >>>>>>>>>>>>>> On Tue, Jul 14, 2015 at 3:39 PM, Enis Söztutar <enis....@gmail.com> wrote:
> >> > >>>>>>>>>>>>>
> >> > >>>>>>>>>>>>>> Sounds good. It has been a while we did the talk-aton.
> >> > >>>>>>>>>>>>>>
> >> > >>>>>>>>>>>>>> I'll be off starting 25 of July, so I prefer something next week if possible.
> >> > >>>>>>>>>>>>>>
> >> > >>>>>>>>>>>>> You ever coming back? If so, when? I'm back on 10th of August (Mikhail on the 20th).
> >> > >>>>>>>>>>>>> St.Ack
> >> > >>>>>>>>>>>>>
> >> > >>>>>>>>>>>>>> Enis
> >> > >>>>>>>>>>>>>>
> >> > >>>>>>>>>>>>>>> On Tue, Jul 14, 2015 at 3:18 PM, Stack <st...@duboce.net> wrote:
> >> > >>>>>>>>>>>>>>>
> >> > >>>>>>>>>>>>>>> Matteo and I were thinking it time devs got together for a pow-wow. There is a bunch of stuff in flight at the moment (see below list) and it would be good to meet and whiteboard, surface goodo ideas that have gone dormant in JIRA, or revisit designs/proposals out in JIRA-attached google doc that need socializing.
> >> > >>>>>>>>>>>>>>>
> >> > >>>>>>>>>>>>>>> You can only come if you are wearing your bullshit hat.
> >> > >>>>>>>>>>>>>>>
> >> > >>>>>>>>>>>>>>> Topics we'd go over could include:
> >> > >>>>>>>>>>>>>>>
> >> > >>>>>>>>>>>>>>> + Our filesystem layout will not work if 1M regions (Matteo/Stack)
> >> > >>>>>>>>>>>>>>> + Current state of the offheaping of read path and alternate KeyValue implementation (Anoop/Ram)
> >> > >>>>>>>>>>>>>>> + Append rejigger (Elliott)
> >> > >>>>>>>>>>>>>>> + A Pv2-based Assign (Matteo/Steven)
> >> > >>>>>>>>>>>>>>> + Splitting meta/1M regions
> >> > >>>>>>>>>>>>>>> + The revived Backup (Vladimir)
> >> > >>>>>>>>>>>>>>> + Time (Enis)
> >> > >>>>>>>>>>>>>>> + The overloaded SequenceId (Stack)
> >> > >>>>>>>>>>>>>>> + Upstreaming IT testing (Dima/Sean)
> >> > >>>>>>>>>>>>>>> + hbase-2.0.0
> >> > >>>>>>>>>>>>>>>
> >> > >>>>>>>>>>>>>>> I put names by folks I know could talk to the topic. If you want to take over a topic or put your name by one, just say. Suggest that discussion lead off with a 5-10minute on current state of thought/design/implementation.
> >> > >>>>>>>>>>>>>>>
> >> > >>>>>>>>>>>>>>> What do others think?
> >> > >>>>>>>>>>>>>>>
> >> > >>>>>>>>>>>>>>> What date would suit folks?
> >> > >>>>>>>>>>>>>>>
> >> > >>>>>>>>>>>>>>> Anyone want to host?
> >> > >>>>>>>>>>>>>>>
> >> > >>>>>>>>>>>>>>> Thanks,
> >> > >>>>>>>>>>>>>>> Matteo and St.Ack
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> --
> >> > >>>>>>>>>>>> Best regards,
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> - Andy
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)