Re: DISCUSSION: lets do a developer workshop on near-term work

ramkrishna vasudevan Sat, 18 Jul 2015 11:15:23 -0700

Would definitely like to attend the meeting if the timing is suitable for
IST.
The performance numbers in 11425 are from cluster testing done using the
basic perf test tools provided by HBase.  We plan to test with a bigger data
set using tools like YCSB, and maybe we will specifically look at the n/w
bandwidth impact on the client side.


Our initial tests were done on the server side, basically to see how much
these changes help.

We would definitely like to take up your concern. BTW, just want to say
that we are about 60 to 70% done. Some major JIRAs like HBASE-12295 are
pending and in their final stage. Apart from that, we have already started
working on some minor sub-tasks which would make the entire flow work
with offheap.

Regards
Ram


On Sat, Jul 18, 2015 at 11:32 PM, Anoop John <[email protected]> wrote:

> No Andy. 11425 has a doc attached to it. At the end of it, we have added
> perf numbers from cluster testing.  This was done using PE get and scan
> tests with filtering of all cells at the server (to not consider n/w
> bandwidth constraints)
>
> -Anoop-
>
> On Sat, Jul 18, 2015 at 9:30 PM, Andrew Purtell <[email protected]>
> wrote:
>
> > We have some microbenchmarks, not evidence of differences seen from a
> > client application. I'm not saying that microbenchmarks are not totally
> > necessary and a great start - they are - but that they don't measure an end
> > goal. Furthermore unless I've missed one somewhere we don't have a JIRA or
> > design doc that states a clear end goal metric like the strawman I threw
> > together in my previous mail. A measurable system level goal and some data
> > from full cluster testing would go a lot further toward letting all of us
> > evaluate the potential and payoff of the work. In the meantime we should
> > probably be assembling these changes on a branch instead of in trunk, for
> > as long as the goal is not clearly defined and the payoff and potential for
> > perf regressions is untested and unknown.
> >
> >
> > > On Jul 18, 2015, at 8:05 AM, Anoop John <[email protected]> wrote:
> > >
> > > Thanks Andy and Lars.  The parent JIRA has a doc attached which contains
> > > some perf gain numbers.  We will be doing more tests in the next 2 weeks
> > > (before the end of this month) and will publish them.  Yes, it would be
> > > great if it is at a more IST-friendly time :-)
> > >
> > > -Anoop-
> > >
> > > On Fri, Jul 17, 2015 at 9:44 PM, Andrew Purtell <[email protected]> wrote:
> > >
> > >>> I can represent your side Ram (and Anoop). I've been known to always
> > >>> argue both sides of a discussion and to never take sides easily (drives
> > >>> some folks crazy).
> > >>
> > >> I can vouch for this (smile)
> > >>
> > >> I also can offer support for off heaping there. At the same time we do
> > >> have a gap where we can't point to a timeline of improvements (yet, anyway)
> > >> with benchmarks showing gains where your goals need them. For example,
> > >> stock HBase in one JVM can address max N GB for response time distribution
> > >> D; dev version of HBase in off heap branch can address max N' GB for
> > >> distribution D', where N' > N and D > D' (distribution D' statistically
> > >> shows better/lower response times).
> > >>
> > >>
> > >>
> > >>> On Jul 17, 2015, at 6:56 AM, lars hofhansl <[email protected]> wrote:
> > >>>
> > >>> I'm in favor of anything that improves performance (and preferably
> > >>> doesn't set us back into a world that's worse than C due to the lack of
> > >>> pointers in Java). Never said "I don't like it", it's just that I'm perhaps
> > >>> asking for more numbers and justification in weighing the pros and cons.
> > >>> I can represent your side Ram (and Anoop). I've been known to always
> > >>> argue both sides of a discussion and to never take sides easily (drives
> > >>> some folks crazy). And Stack's there too, he'll yell at me where needed :)
> > >>>
> > >>> Perhaps we can do it a bit later in the evening so there is a fighting
> > >>> chance that folks on IST can participate. I know that some of our folks on
> > >>> IST would love to participate in the backup discussion.
> > >>>
> > >>> Like Enis, I'm also happy to host. We're in Downtown SF. I'd just need
> > >>> an approx. number of folks.
> > >>>
> > >>> -- Lars
> > >>>
> > >>>     From: ramkrishna vasudevan <[email protected]>
> > >>> To: "[email protected]" <[email protected]>; lars hofhansl <[email protected]>
> > >>> Sent: Wednesday, July 15, 2015 10:10 AM
> > >>> Subject: Re: DISCUSSION: lets do a developer workshop on near-term work
> > >>>
> > >>> Hi
> > >>> What time will it be on August 26th?
> > >>> @Lars Ya. I know that you are not generally in favour of this offheaping
> > >>> stuff.  Maybe if we (from India) can attend this meeting remotely, your
> > >>> thoughts can be discussed and also the current state of this work.
> > >>> Regards
> > >>> Ram
> > >>>
> > >>>
> > >>> On Wed, Jul 15, 2015 at 9:28 PM, lars hofhansl <[email protected]> wrote:
> > >>>
> > >>> Works for me. I'll be back in the Bay Area the week of August 9th.
> > >>> We have done a _lot_ of work on backups as well - ours are more
> > >>> complicated as we wanted fast per-tenant restores, so data is "grouped" by
> > >>> tenant. Would like to sync up on that (hopefully some of the folks who
> > >>> wrote most of the code will be in town, I'll check).
> > >>>
> > >>> Also interested in the "Time" and "offheap" parts (although you folks
> > >>> usually do not like what I think about the offheap efforts :) ).
> > >>> Would like to add the following topics:
> > >>>
> > >>>
> > >>> - "Timestamp Resolution". Or making space for more bits in the
> > >>> timestamps (happy to cover that, unless it's part of the "Time" topic)
> > >>>
> > >>>
> > >>> - "Replication". We found that replication cannot keep up with high
> > >>> write loads, due to the fact that replication is strictly single threaded
> > >>> per regionserver (even though we have multiple region servers on the sink
> > >>> side)
> > >>>
> > >>>
> > >>> - "Spark integration" (Ted Malaska?)
> > >>>
> > >>>
> > >>> OK... Out now to make a "bullshit hat".
> > >>>
> > >>> -- Lars
> > >>>
> > >>> ________________________________
> > >>> From: Sean Busbey <[email protected]>
> > >>> To: dev <[email protected]>
> > >>> Sent: Tuesday, July 14, 2015 7:11 PM
> > >>> Subject: Re: DISCUSSION: lets do a developer workshop on near-term work
> > >>>
> > >>>
> > >>> I'm planning to be in the Bay area the week of the 24th of August.
> > >>>
> > >>> --
> > >>> Sean
> > >>>
> > >>>
> > >>>
> > >>>> On Jul 14, 2015 7:53 PM, "Andrew Purtell" <[email protected]> wrote:
> > >>>>
> > >>>> I can be up in your area in August.
> > >>>>
> > >>>>>> On Tue, Jul 14, 2015 at 5:31 PM, Stack <[email protected]> wrote:
> > >>>>>>
> > >>>>>> On Tue, Jul 14, 2015 at 3:39 PM, Enis Söztutar <[email protected]> wrote:
> > >>>>>
> > >>>>>> Sounds good. It has been a while since we did the talk-a-thon.
> > >>>>>>
> > >>>>>> I'll be off starting the 25th of July, so I prefer something next week
> > >>>>>> if possible.
> > >>>>>>
> > >>>>>> You ever coming back? If so, when? I'm back on 10th of August (Mikhail
> > >>>>> on the 20th).
> > >>>>> St.Ack
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>> Enis
> > >>>>>>
> > >>>>>>> On Tue, Jul 14, 2015 at 3:18 PM, Stack <[email protected]> wrote:
> > >>>>>>>
> > >>>>>>> Matteo and I were thinking it time devs got together for a pow-wow.
> > >>>>>>> There is a bunch of stuff in flight at the moment (see below list) and
> > >>>>>>> it would be good to meet and whiteboard, surface good ideas that have
> > >>>>>>> gone dormant in JIRA, or revisit designs/proposals out in JIRA-attached
> > >>>>>>> Google docs that need socializing.
> > >>>>>>>
> > >>>>>>> You can only come if you are wearing your bullshit hat.
> > >>>>>>>
> > >>>>>>> Topics we'd go over could include:
> > >>>>>>>
> > >>>>>>> + Our filesystem layout will not work if 1M regions (Matteo/Stack)
> > >>>>>>> + Current state of the offheaping of read path and alternate KeyValue
> > >>>>>>> implementation (Anoop/Ram)
> > >>>>>>> + Append rejigger (Elliott)
> > >>>>>>> + A Pv2-based Assign (Matteo/Steven)
> > >>>>>>> + Splitting meta/1M regions
> > >>>>>>> + The revived Backup (Vladimir)
> > >>>>>>> + Time (Enis)
> > >>>>>>> + The overloaded SequenceId (Stack)
> > >>>>>>> + Upstreaming IT testing (Dima/Sean)
> > >>>>>>> + hbase-2.0.0
> > >>>>>>>
> > >>>>>>> I put names by folks I know could talk to the topic. If you want to
> > >>>>>>> take over a topic or put your name by one, just say.  Suggest that
> > >>>>>>> each discussion lead off with 5-10 minutes on the current state of
> > >>>>>>> thought/design/implementation.
> > >>>>>>>
> > >>>>>>> What do others think?
> > >>>>>>>
> > >>>>>>> What date would suit folks?
> > >>>>>>>
> > >>>>>>> Anyone want to host?
> > >>>>>>>
> > >>>>>>> Thanks,
> > >>>>>>> Matteo and St.Ack
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> Best regards,
> > >>>>
> > >>>>    - Andy
> > >>>>
> > >>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > >>>> (via Tom White)
> > >>
> >
>
