We have some microbenchmarks, but not evidence of differences seen from a client
application. I'm not saying microbenchmarks are unnecessary. They are necessary
and a great start, but they don't measure an end goal. Furthermore, unless I've
missed one somewhere, we don't have a JIRA or design doc that states a clear
end-goal metric like the strawman I threw together in my previous mail. A
measurable system-level goal and some data from full-cluster testing would go a
lot further toward letting all of us evaluate the potential and payoff of the
work. In the meantime, we should probably be assembling these changes on a
branch instead of in trunk, for as long as the goal is not clearly defined and
the payoff and the potential for perf regressions are untested and unknown.
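
To make the strawman from the previous mail concrete, here is a minimal sketch
of how such a goal could be checked against measured data. Everything in it is
an assumption for illustration: the function names, the choice of percentiles,
and the sample numbers are hypothetical and not taken from any JIRA or test
run. Comparing a few percentiles is also only a rough stand-in for "D' is
statistically better than D"; a real evaluation would want a proper test over
full-cluster data.

```python
import math


def percentile(samples, p):
    """Nearest-rank p-th percentile (0 < p <= 100) of a list of latencies."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def meets_goal(stock_gb, stock_ms, dev_gb, dev_ms, checks=(50, 95, 99)):
    """Strawman check: N' > N and D' lower than D at every checked percentile.

    stock_gb / dev_gb: max addressable GB for stock and dev builds (N, N').
    stock_ms / dev_ms: response time samples in ms for each build (D, D').
    """
    more_memory = dev_gb > stock_gb
    faster = all(
        percentile(dev_ms, p) < percentile(stock_ms, p) for p in checks
    )
    return more_memory and faster


# Made-up numbers, purely to show the call shape.
stock_ms = [10, 12, 15, 20, 40]
dev_ms = [8, 9, 11, 14, 25]
print(meets_goal(32, stock_ms, 64, dev_ms))
```

The point of writing it down this way is only that the goal becomes a yes/no
question over measured numbers, which is what a design doc or JIRA could state
up front.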


> On Jul 18, 2015, at 8:05 AM, Anoop John <[email protected]> wrote:
> 
> Thanks Andy and Lars.  The parent jira has a doc attached which contains some
> perf gain numbers.  We will be doing more tests in the next 2 weeks (before
> the end of this month) and will publish them.   Yes it will be great if it is
> at a more IST-friendly time :-)
> 
> -Anoop-
> 
> On Fri, Jul 17, 2015 at 9:44 PM, Andrew Purtell <[email protected]>
> wrote:
> 
>>> I can represent your side Ram (and Anoop). I've been known to always argue
>> both sides of a discussion and to never take sides easily (drives some folks
>> crazy).
>> 
>> I can vouch for this (smile)
>> 
>> I also can offer support for off heaping there. At the same time we do
>> have a gap where we can't point to a timeline of improvements (yet, anyway)
>> with benchmarks showing gains where your goals need them. For example,
>> stock HBase in one JVM can address max N GB for response time distribution
>> D; dev version of HBase in off heap branch can address max N' GB for
>> distribution D', where N' > N and D > D' (distribution D' statistically
>> shows better/lower response times).
>> 
>> 
>> 
>>> On Jul 17, 2015, at 6:56 AM, lars hofhansl <[email protected]> wrote:
>>> 
>>> I'm in favor of anything that improves performance (and preferably
>> doesn't set us back into a world that's worse than C due to the lack of
>> pointers in Java). Never said "I don't like it", it's just that I'm perhaps
>> asking for more numbers and justification in weighing the pros and cons.
>>> I can represent your side Ram (and Anoop). I've been known to always argue
>> both sides of a discussion and to never take sides easily (drives some folks
>> crazy). And Stack's there too, he'll yell at me where needed :)
>>> 
>>> Perhaps we can do it a bit later in the evening so there is a fighting
>> chance that folks on IST can participate. I know that some of our folks on
>> IST would love to participate in the backup discussion.
>>> 
>>> Like Enis, I'm also happy to host. We're in Downtown SF. I'd just need
>> an approx. number of folks.
>>> 
>>> -- Lars
>>> 
>>> From: ramkrishna vasudevan <[email protected]>
>>> To: "[email protected]" <[email protected]>; lars hofhansl <
>> [email protected]>
>>> Sent: Wednesday, July 15, 2015 10:10 AM
>>> Subject: Re: DISCUSSION: lets do a developer workshop on near-term work
>>> 
>>> Hi
>>> What time will it be on August 26th?
>>> @Lars
>>> Ya. I know that you are not generally in favour of this offheaping
>> stuff.  Maybe if we (from India) can attend this meeting remotely your
>> thoughts can be discussed and also the current state of this work.
>>> Regards
>>> Ram
>>> 
>>> 
>>> On Wed, Jul 15, 2015 at 9:28 PM, lars hofhansl <[email protected]> wrote:
>>> 
>>> Works for me. I'll be back in the Bay Area the week of August 9th.
>>> We have done a _lot_ of work on backups as well - ours are more
>> complicated as we wanted fast per-tenant restores, so data is "grouped" by
>> tenant. Would like to sync up on that (hopefully some of the folks who
>> wrote most of the code will be in town, I'll check).
>>> 
>>> Also interested in the "Time" and "offheap" parts (although you folks
>> usually do not like what I think about the offheap efforts :) ).
>>> Would like to add the following topics:
>>> 
>>> 
>>> - "Timestamp Resolution". Or making space for more bits in the
>> timestamps (happy to cover that, unless it's part of the "Time" topic)
>>> 
>>> 
>>> - "Replication". We found that replication cannot keep up with high
>> write loads, due to the fact that replication is strictly single-threaded
>> per regionserver (even though we have multiple region servers on the sink
>> side)
>>> 
>>> 
>>> - "Spark integration" (Ted Malaska?)
>>> 
>>> 
>>> OK... Out now to make a "bullshit hat".
>>> 
>>> -- Lars
>>> 
>>> ________________________________
>>> From: Sean Busbey <[email protected]>
>>> To: dev <[email protected]>
>>> Sent: Tuesday, July 14, 2015 7:11 PM
>>> Subject: Re: DISCUSSION: lets do a developer workshop on near-term work
>>> 
>>> 
>>> I'm planning to be in the Bay area the week of the 24th of August.
>>> 
>>> --
>>> Sean
>>> 
>>> 
>>> 
>>>> On Jul 14, 2015 7:53 PM, "Andrew Purtell" <[email protected]> wrote:
>>>> 
>>>> I can be up in your area in August.
>>>> 
>>>>>> On Tue, Jul 14, 2015 at 5:31 PM, Stack <[email protected]> wrote:
>>>>>> 
>>>>>> On Tue, Jul 14, 2015 at 3:39 PM, Enis Söztutar <[email protected]>
>>>>> wrote:
>>>>> 
>>>>>> Sounds good. It has been a while since we did the talk-a-thon.
>>>>>> 
>>>>>> I'll be off starting the 25th of July, so I prefer something next week if
>>>>>> possible.
>>>>>> 
>>>>>> You ever coming back? If so, when? I'm back on 10th of August (Mikhail
>>>> on
>>>>> the 20th).
>>>>> St.Ack
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> Enis
>>>>>> 
>>>>>>> On Tue, Jul 14, 2015 at 3:18 PM, Stack <[email protected]> wrote:
>>>>>>> 
>>>>>>> Matteo and I were thinking it is time devs got together for a pow-wow.
>>>>> There
>>>>>>> is a bunch of stuff in flight at the moment (see below list) and it
>>>>> would
>>>>>>> be good to meet and whiteboard, surface good ideas that have gone
>>>>>> dormant
>>>>>>> in JIRA, or revisit designs/proposals out in JIRA-attached Google docs
>>>>>> that
>>>>>>> need socializing.
>>>>>>> 
>>>>>>> You can only come if you are wearing your bullshit hat.
>>>>>>> 
>>>>>>> Topics we'd go over could include:
>>>>>>> 
>>>>>>> + Our filesystem layout will not work with 1M regions (Matteo/Stack)
>>>>>>> + Current state of the offheaping of read path and alternate KeyValue
>>>>>>> implementation (Anoop/Ram)
>>>>>>> + Append rejigger (Elliott)
>>>>>>> + A Pv2-based Assign (Matteo/Steven)
>>>>>>> + Splitting meta/1M regions
>>>>>>> + The revived Backup (Vladimir)
>>>>>>> + Time (Enis)
>>>>>>> + The overloaded SequenceId (Stack)
>>>>>>> + Upstreaming IT testing (Dima/Sean)
>>>>>>> + hbase-2.0.0
>>>>>>> 
>>>>>>> I put names by folks I know could talk to the topic. If you want to
>>>>> take
>>>>>>> over a topic or put your name by one, just say.  Suggest that each
>>>>> discussion
>>>>>>> lead off with a 5-10 minute overview of the current state of
>>>>>>> thought/design/implementation.
>>>>>>> 
>>>>>>> What do others think?
>>>>>>> 
>>>>>>> What date would suit folks?
>>>>>>> 
>>>>>>> Anyone want to host?
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Matteo and St.Ack
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Best regards,
>>>> 
>>>>    - Andy
>>>> 
>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>>> (via Tom White)
>> 
