Bob D, can you give more details on the data set you're testing?
Number of docs, size/complexity of docs, etc? Basically, enough info
that I could write a script to automate building an equivalent
database.

I wrote a quick bash script to make a database and time a view build
here: http://friendpaste.com/7kBiKJn3uX1KiGJAFPv4nK
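In case the paste goes away, here is a rough sketch in the same spirit (not the paste itself): the database name, doc shape, counts, and view are all made up for illustration, and it assumes a CouchDB at http://127.0.0.1:5984. Without RUN=1 it only prints the commands it would run, so it can be read without a live server.

```shell
#!/bin/sh
# Illustrative sketch only -- the real script is in the friendpaste link.
# Assumptions (invented for illustration): CouchDB at $COUCH, a database
# named "perftest", tiny {"value": N} docs, and a trivial map view.
# Dry-run by default; set RUN=1 to actually hit the server.

COUCH=${COUCH:-http://127.0.0.1:5984}
DB=perftest
NDOCS=${NDOCS:-1000}
BATCH=${BATCH:-100}

run() { if [ "$RUN" = 1 ]; then "$@"; else echo "+ $*"; fi; }

# Emit one _bulk_docs payload of $BATCH docs, ids starting at $1.
make_batch() {
  start=$1
  i=0
  printf '{"docs":['
  while [ "$i" -lt "$BATCH" ]; do
    [ "$i" -gt 0 ] && printf ','
    printf '{"_id":"doc-%d","value":%d}' $((start + i)) $((start + i))
    i=$((i + 1))
  done
  printf ']}'
}

run curl -sX DELETE "$COUCH/$DB"
run curl -sX PUT "$COUCH/$DB"

n=0
while [ "$n" -lt "$NDOCS" ]; do
  if [ "$RUN" = 1 ]; then
    make_batch "$n" | curl -s -H 'Content-Type: application/json' \
      -d @- "$COUCH/$DB/_bulk_docs" >/dev/null
  else
    echo "+ POST $COUCH/$DB/_bulk_docs (docs $n..$((n + BATCH - 1)))"
  fi
  n=$((n + BATCH))
done

run curl -sX PUT -H 'Content-Type: application/json' \
  -d '{"views":{"byval":{"map":"function(doc){emit(doc.value,1);}"}}}' \
  "$COUCH/$DB/_design/perf"

# The first read of the view triggers the index build; time that request.
t0=$(date +%s)
run curl -s "$COUCH/$DB/_design/perf/_view/byval?limit=0"
echo "view query returned after $(($(date +%s) - t0))s"
```

With a server running, RUN=1 against both 1.1.x and 1.2.x would give a crude side-by-side number, and varying NDOCS and the doc body size is one way to probe the tiny-docs-vs-bigger-docs suspicion discussed below.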

B.

On 27 February 2012 13:15, Jan Lehnardt <j...@apache.org> wrote:
>
> On Feb 27, 2012, at 12:58 , Bob Dionne wrote:
>
>> Thanks for the clarification. I hope I'm not conflating things by continuing 
>> the discussion here; I thought that's what you requested?
>
> The discussion we had on IRC was regarding collecting more data items for the 
> performance regression before we start to draw conclusions.
>
> My intention here is to understand what needs doing before we can release 
> 1.2.0.
>
> I'll reply inline for the other issues.
>
>> I just downloaded the release candidate again to start fresh. "make 
>> distcheck" hangs on this step:
>>
>> /Users/bitdiddle/Downloads/apache-couchdb-1.2.0/apache-couchdb-1.2.0/_build/../test/etap/150-invalid-view-seq.t
>>  ......... 6/?
>>
>> Just stops completely. This is on R15B, which has been rebuilt to use the 
>> recommended older SSL version. I haven't looked into this crash too 
>> closely, but I'm suspicious that I only see it with couchdb and never with 
>> bigcouch, and never using the 1.2.x branch from source, or any branch for that 
>> matter.
>
> From the release you should run `make check`, not `make distcheck`. But I 
> assume you see a hang there too, as I and others (though not everybody) 
> have. I can't comment on BigCouch and what is different there. It is 
> interesting that 1.2.x won't hang for you. For me, `make check` in 1.2.x on R15B 
> hangs sometimes, in different places. I'm currently trying to gather more 
> information about this.
>
> The question here is whether `make check` passing on R15B is a release 
> requirement. In my vote I assumed it is not, but I am happy to go with a 
> community decision if one emerges. What is your take here?
>
> In addition, this just shouldn't be a question, so we should investigate why 
> this happens at all and address the issue, hence COUCHDB-1424. Any insight 
> here would be appreciated as well.
>
>
>> In the command line tests, 2, 7, 27, and 32 fail, but it differs from run to 
>> run.
>
> I assume you mean the JS tests. Again, this isn't supposed to work in 1.2.x. 
> I'm happy to backport my changes from master to 1.2.x to make that work, but 
> I refrained from that because I didn't want to bring too much change to a 
> release branch. I'm happy to reconsider, but I don't think a release vote is 
> a good place to discuss feature backports.
>
>
>> On Chrome attachment_ranges fails and it hangs on replicator_db
>
> This one is an "explaining away", but I think it is warranted. Chrome is 
> broken for attachment_ranges. I don't know if we reported this upstream 
> (Robert N?), but this isn't a release blocker. For the replicator_db test, 
> can you try running that in other browsers? I understand it is not the best 
> of situations (hence the move to the cli test suite for master), but if you 
> get this test to pass in at least one other browser, this isn't a problem 
> that holds up 1.2.x.
>
>
>> With respect to performance, I think comparisons with 1.1.x are important. I 
>> think almost any use case, contrived or otherwise, should not be dismissed as 
>> pathological or an edge case. Bob's script is as simple as it gets and to me 
>> is a great smoke test. We need to figure out why 1.2 is clearly 
>> slower in this case. If there are specific scenarios that 1.2.x is optimized 
>> for, then we should document that and provide reasons for the trade-offs.
>
> I want to make absolutely clear that I take any report of a performance 
> regression very seriously. But I'm rather annoyed that no information about 
> this ends up on dev@. I understand that on IRC there's some shared 
> understanding of a few scenarios where performance regressions can be shown. 
> I have now asked three times that these be posted to this mailing list. I'm not 
> asking for a comprehensive report, but anything, really. I found Robert 
> Newson's simple test script on IRC and ran it to test a suspicion of mine, 
> which I posted in an earlier mail (tiny docs -> slower, bigger docs -> 
> faster). Nobody else bothered to post this here. I see no discussion about 
> what is observed, what is expected, and what would or would not be acceptable 
> for a release of 1.2.0 as is.
>
> As far as this list is concerned, we know that a few people claimed that 
> things are slower, that it's very real, and that we should hold the 1.2.0 
> release for it. I'm more than happy to hold the release until we have figured 
> out the things I asked for above, and to help figure it all out. But we need 
> something to work with here.
>
> I also understand that this is a voluntary project and people don't have 
> infinite time to spend, but at least a message of "we're collecting things, 
> will report when done" would be *great* to start. So far we only have a 
> "hold the horses, there might be something going on".
>
> Please let me know if this request is unreasonable or whether I am 
> overreacting.
>
> Sorry for the rant.
>
> To anyone who has been looking into the performance regression: can you please 
> send this list any info you have? If you have a comprehensive analysis, 
> awesome; if you just ran some script on a machine, send us that. Let's 
> collect all the data to get this situation solved! We need your help.
>
>
> tl;dr:
>
> There are three issues at hand:
>
>  - Robert D -1'd a release artefact. We want to understand what needs to 
> happen to make a release. This includes assessing the issues he raises and 
> squaring them against the release vote.
>
>  - There's a vague (as far as dev@ is concerned) report about a performance 
> regression. We need to get to the bottom of that.
>
>  - There's been a non-dev@ discussion about the performance regression and 
> that is referenced to influence a dev@ decision. We need that discussion's 
> information on dev@ to proceed.
>
>
> And to make it absolutely clear again: the performance regression *is* an 
> issue, and I am very grateful to the people, including Robert Newson, Robert 
> Dionne and Jason Smith, who are looking into it. It's just that we need to treat 
> this as an issue and get all this info onto dev@ or into JIRA.
>
>
> Cheers
> Jan
> --
>
>
>
>>
>> Cheers,
>>
>> Bob
>>
>>
>> On Feb 26, 2012, at 4:07 PM, Jan Lehnardt wrote:
>>
>>> Bob,
>>>
>>> thanks for your reply
>>>
>>> I wasn't implying we should try to explain anything away. All of these are 
>>> valid concerns; I just wanted to get a better understanding of where the 
>>> bit flips from +0 to -1 and, subsequently, how to address that boundary.
>>
>>
>>
>>
>>> Ideally we can just fix all of the things you mention, but I think it is 
>>> important to understand them in detail, that's why I was going into them. 
>>> Ultimately, I want to understand what we need to do to ship 1.2.0.
>>>
>>> On Feb 26, 2012, at 21:22 , Bob Dionne wrote:
>>>
>>>> Jan,
>>>>
>>>> I'm -1 based on all of my evaluation. I've now spent a few hours on this 
>>>> release, yesterday and today. It doesn't really pass what I would call 
>>>> the "smoke test". Almost everything I've run into has an explanation:
>>>>
>>>> 1. crashes out of the box - that's R15B; you need to recompile SSL and 
>>>> Erlang (we'll note it in the release notes)
>>>
>>> Have we spent any time on figuring out what the trouble here is?
>>>
>>>
>>>> 2. etaps hang running make check. Known issue. Our etap code is out of 
>>>> date; recent versions of etap don't even run their own unit tests
>>>
>>> I have seen the etap hang as well, and I wasn't diligent enough to report 
>>> it in JIRA, I have done so now (COUCHDB-1424).
>>>
>>>
>>>> 3. Futon tests fail. Some are known bugs (attachment ranges in Chrome). 
>>>> Both Chrome and Safari also hang
>>>
>>> Do you have more details on where Chrome and Safari hang? Can you try their 
>>> private browsing features and double/triple-check that caches are empty? Can 
>>> you get to a situation where every test succeeds in at least one browser, 
>>> even if individual tests fail in one or two others?
>>>
>>>
>>>> 4. standalone JS tests fail. Again, most of these pass when run by themselves
>>>
>>> Which ones?
>>>
>>>
>>>> 5. performance. I used real production data *because* Stefan on user@ 
>>>> reported performance degradation on his data set. Any numbers are 
>>>> meaningless for a single test. I also ran scripts that BobN and Jason 
>>>> Smith posted that show a difference between 1.1.x and 1.2.x.
>>>
>>> You are conflating an IRC discussion we had with this thread. The 
>>> performance regression reported is a good reason to look into other 
>>> scenarios where we can show slowdowns. But we need to understand what's 
>>> happening. Just from looking at dev@, all I see is some handwaving about 
>>> some reports some people have done (not to discourage any work that has 
>>> been done on IRC and user@, but for the sake of a release vote thread, this 
>>> related information needs to be on this mailing list).
>>>
>>> As I said on IRC, I'm happy to get my hands dirty to understand the 
>>> regression at hand. But we need to know where we'd draw a line and say this 
>>> isn't acceptable for a 1.2.0.
>>>
>>>
>>>> 6. Reviewed patch pointed to by Jason that may be the cause but it's hard 
>>>> to say without knowing the code analysis that went into the changes. You 
>>>> can see obvious local optimizations that make good sense but those are 
>>>> often the ones that get you, without knowing the call counts.
>>>
>>> That is a point that wasn't included in your previous mail. It's great that 
>>> there is progress, thanks for looking into this!
>>>
>>>
>>>> Many of these issues can be explained away, but I think end users will be 
>>>> less forgiving. I think we already struggle with view performance. I'm 
>>>> interested to see how others evaluate this regression.
>>>> I'll try this seatoncouch tool you mention later to see if I can construct 
>>>> some more definitive tests.
>>>
>>> Again, I'm not trying to explain anything away. I want to get a shared 
>>> understanding of the issues you raised and where we stand on solving them 
>>> squared against the ongoing 1.2.0 release.
>>>
>>> And again: Thanks for doing this thorough review and for looking into the 
>>> performance issue. I hope that with your help we can understand all these things 
>>> a lot better very soon :)
>>>
>>> Cheers
>>> Jan
>>> --
>>>
>>>
>>>>
>>>> Best,
>>>>
>>>> Bob
>>>> On Feb 26, 2012, at 2:29 PM, Jan Lehnardt wrote:
>>>>
>>>>>
>>>>> On Feb 26, 2012, at 13:58 , Bob Dionne wrote:
>>>>>
>>>>>> -1
>>>>>>
>>>>>> R15B on OS X Lion
>>>>>>
>>>>>> I rebuilt OTP with an older SSL and that gets past all the crashes 
>>>>>> (thanks Filipe). I still see hangs when running make check, though any 
>>>>>> particular etap that hangs will run ok by itself. The Futon tests never 
>>>>>> run to completion in Chrome without hanging, and the standalone JS tests 
>>>>>> also have failures.
>>>>>
>>>>> What part of this do you consider the -1? Can you try running the JS 
>>>>> tests in Firefox and/or Safari? Can you get all tests to pass at least once 
>>>>> across all browsers? The cli JS suite isn't supposed to work, so that 
>>>>> isn't a criterion. I've seen the hang in make check on R15B while 
>>>>> individual tests run fine as well, but I don't consider this blocking. While I 
>>>>> understand and support the notion that tests shouldn't fail, period, we 
>>>>> gotta work with what we have, and master already has significant 
>>>>> improvements. What would you like to see changed to not -1 this release?
>>>>>
>>>>>> I tested the performance of view indexing, using a modest 200K doc db 
>>>>>> with a large complex view, and there's a clear regression between 1.1.x 
>>>>>> and 1.2.x. Others report similar results.
>>>>>
>>>>> What is a large complex view? The complexity of the map/reduce functions 
>>>>> is rarely an indicator of performance; it's usually input doc size and 
>>>>> output/emit()/reduce data size. How big are the docs in your test, and how 
>>>>> big is the returned data? I understand the changes for 1.2.x will improve 
>>>>> larger-data scenarios more significantly.
>>>>>
>>>>> Cheers
>>>>> Jan
>>>>> --
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> On Feb 23, 2012, at 5:25 PM, Bob Dionne wrote:
>>>>>>
>>>>>>> sorry Noah, I'm in debug mode today so I don't care to start mucking 
>>>>>>> with my stack, recompiling erlang, etc...
>>>>>>>
>>>>>>> I did try using that build repeatedly and it crashes all the time. I 
>>>>>>> find it very odd, though as I said, I had seen those before on my older 
>>>>>>> MacBook.
>>>>>>>
>>>>>>> I do see the hangs Jan describes in the etaps; they have been there 
>>>>>>> all along, so I'm confident this is just the SSL issue. Why it only 
>>>>>>> happens with the release build is puzzling; any source build of any branch works 
>>>>>>> just peachy.
>>>>>>>
>>>>>>> So I'd say I'm +1 based on my use of the 1.2.x branch, but I'd like to 
>>>>>>> hear from Stefan, who reported the severe performance regression. BobN 
>>>>>>> seems to think we can ignore that, that it's something flaky in that 
>>>>>>> fellow's environment. I tend to agree, but I'm conservative.
>>>>>>>
>>>>>>> On Feb 23, 2012, at 1:23 PM, Noah Slater wrote:
>>>>>>>
>>>>>>>> Can someone convince me that this bus error and segfault stuff is not a
>>>>>>>> blocking issue?
>>>>>>>>
>>>>>>>> Bob tells me that he's followed the steps above and he's still 
>>>>>>>> experiencing
>>>>>>>> the issues.
>>>>>>>>
>>>>>>>> Bob, you did follow the steps to install your own SSL, right?
>>>>>>>>
>>>>>>>> On Thu, Feb 23, 2012 at 5:09 PM, Jan Lehnardt <j...@apache.org> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Feb 23, 2012, at 00:28 , Noah Slater wrote:
>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> I would like to call a vote for the Apache CouchDB 1.2.0 release, second
>>>>>>>>> round.
>>>>>>>>>>
>>>>>>>>>> We encourage the whole community to download and test these
>>>>>>>>>> release artifacts so that any critical issues can be resolved before 
>>>>>>>>>> the
>>>>>>>>>> release is made. Everyone is free to vote on this release, so get 
>>>>>>>>>> stuck
>>>>>>>>> in!
>>>>>>>>>>
>>>>>>>>>> We are voting on the following release artifacts:
>>>>>>>>>>
>>>>>>>>>> http://people.apache.org/~nslater/dist/1.2.0/
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> These artifacts have been built from the following tree-ish in Git:
>>>>>>>>>>
>>>>>>>>>> 4cd60f3d1683a3445c3248f48ae064fb573db2a1
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Please follow the test procedure before voting:
>>>>>>>>>>
>>>>>>>>>> http://wiki.apache.org/couchdb/Test_procedure
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thank you.
>>>>>>>>>>
>>>>>>>>>> Happy voting,
>>>>>>>>>
>>>>>>>>> Signature and hashes check out.
>>>>>>>>>
>>>>>>>>> Mac OS X 10.7.3, 64bit, SpiderMonkey 1.8.0, Erlang R14B04: make check
>>>>>>>>> works fine, browser tests in Safari work fine.
>>>>>>>>>
>>>>>>>>> Mac OS X 10.7.3, 64bit, SpiderMonkey 1.8.5, Erlang R14B04: make check
>>>>>>>>> works fine, browser tests in Safari work fine.
>>>>>>>>>
>>>>>>>>> FreeBSD 9.0, 64bit, SpiderMonkey 1.7.0, Erlang R14B04: make check 
>>>>>>>>> works
>>>>>>>>> fine, browser tests in Safari work fine.
>>>>>>>>>
>>>>>>>>> CentOS 6.2, 64bit, SpiderMonkey 1.8.5, Erlang R14B04: make check works
>>>>>>>>> fine, browser tests in Firefox work fine.
>>>>>>>>>
>>>>>>>>> Ubuntu 11.4, 64bit, SpiderMonkey 1.8.5, Erlang R14B02: make check 
>>>>>>>>> works
>>>>>>>>> fine, browser tests in Firefox work fine.
>>>>>>>>>
>>>>>>>>> Ubuntu 10.4, 32bit, SpiderMonkey 1.8.0, Erlang R13B03: make check 
>>>>>>>>> fails in
>>>>>>>>> - 076-file-compression.t: https://gist.github.com/1893373
>>>>>>>>> - 220-compaction-daemon.t: https://gist.github.com/1893387
>>>>>>>>> This one runs in a VM and is 32bit, so I don't know if there's 
>>>>>>>>> anything in
>>>>>>>>> the tests that relies on 64bittyness or the R14B03. Filipe, I think you
>>>>>>>>> worked on both features, do you have an idea?
>>>>>>>>>
>>>>>>>>> I tried running it all through Erlang R15B on Mac OS X 10.7.3, but a 
>>>>>>>>> good
>>>>>>>>> way into `make check` the tests would just stop and hang. The last 
>>>>>>>>> time,
>>>>>>>>> repeatedly in 160-vhosts.t, but when run alone, that test finished in 
>>>>>>>>> under
>>>>>>>>> five seconds. I'm not sure what the issue is here.
>>>>>>>>>
>>>>>>>>> Despite the things above, I'm happy to give this a +1 if we put a 
>>>>>>>>> warning
>>>>>>>>> about R15B on the download page.
>>>>>>>>>
>>>>>>>>> Great work all!
>>>>>>>>>
>>>>>>>>> Cheers
>>>>>>>>> Jan
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
