Re: [VOTE] Apache CouchDB 1.2.0 release, second round

Jason Smith Mon, 27 Feb 2012 16:41:39 -0800

Hi, Filipe. Most people seem to be holding their OTP build constant
for these tests.


If you have the time, would you please check out
https://github.com/jhs/slow_couchdb

It uses seatoncouch mixed with Bob's script to run a basic benchmark.
I expect more template types to grow to help create different data
profiles.

Anyway, here are my results with 500k documents. Note that I built
from your optimization commit, then its parent.

https://gist.github.com/1928169

tl;dr = 2:50 before your commit; 4:13 after.
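
For scale, here is a minimal sketch of the arithmetic behind those two numbers. It assumes both figures are minutes:seconds wall-clock times for the same 500k-document view build. The helper name `to_seconds` is illustrative only and is not part of slow_couchdb.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string into whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Timings quoted above, assumed to be minutes:seconds.
before = to_seconds("2:50")
after = to_seconds("4:13")

# Relative slowdown of the build after the commit versus its parent.
slowdown = (after - before) / before
print(f"{before}s -> {after}s, {slowdown:.0%} slower")
```

Under that assumption the build after the commit takes roughly half again as long as the build from its parent.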

On Mon, Feb 27, 2012 at 11:33 PM, Filipe David Manana
<fdman...@apache.org> wrote:
> I just tried Jason's script (modified it to use 500 000 docs instead
> of 50 000) against 1.2.x and 1.1.1, using OTP R14B03. Here are my
> results:
>
> 1.2.x:
>
> $ port=5984 ./test.sh
> "none"
> Filling db.
> done
> HTTP/1.1 200 OK
> Server: CouchDB/1.2.0 (Erlang OTP/R14B03)
> Date: Mon, 27 Feb 2012 16:08:43 GMT
> Content-Type: text/plain; charset=utf-8
> Content-Length: 252
> Cache-Control: must-revalidate
>
> {"db_name":"db1","doc_count":500001,"doc_del_count":0,"update_seq":500001,"purge_seq":0,"compact_running":false,"disk_size":130494577,"data_size":130490673,"instance_start_time":"1330358830830086","disk_format_version":6,"committed_update_seq":500001}
> Building view.
>
> real    1m5.725s
> user    0m0.006s
> sys     0m0.005s
> done
>
>
> 1.1.1:
>
> $ port=5984 ./test.sh
> ""
> Filling db.
> done
> HTTP/1.1 200 OK
> Server: CouchDB/1.1.2a785d32f-git (Erlang OTP/R14B03)
> Date: Mon, 27 Feb 2012 16:15:33 GMT
> Content-Type: text/plain;charset=utf-8
> Content-Length: 230
> Cache-Control: must-revalidate
>
> {"db_name":"db1","doc_count":500001,"doc_del_count":0,"update_seq":500001,"purge_seq":0,"compact_running":false,"disk_size":122142818,"instance_start_time":"1330359233327316","disk_format_version":5,"committed_update_seq":500001}
> Building view.
>
> real    1m4.249s
> user    0m0.006s
> sys     0m0.005s
> done
>
>
> I don't see any significant difference there.
>
> Regarding COUCHDB-1186, the only thing that might cause some
> non-determinism and affect performance is the queuing/dequeuing.
> Depending on timings, it's possible the writer is dequeuing fewer items
> per dequeue operation and therefore inserting smaller batches into the
> btree. The following small change ensures larger batches (while still
> respecting the queue max size/item count):
>
> http://friendpaste.com/178nPFgfyyeGf2vtNRpL0w
>
> Running the test with this change:
>
> $ port=5984 ./test.sh
> "none"
> Filling db.
> done
> HTTP/1.1 200 OK
> Server: CouchDB/1.2.0 (Erlang OTP/R14B03)
> Date: Mon, 27 Feb 2012 16:23:20 GMT
> Content-Type: text/plain; charset=utf-8
> Content-Length: 252
> Cache-Control: must-revalidate
>
> {"db_name":"db1","doc_count":500001,"doc_del_count":0,"update_seq":500001,"purge_seq":0,"compact_running":false,"disk_size":130494577,"data_size":130490673,"instance_start_time":"1330359706846104","disk_format_version":6,"committed_update_seq":500001}
> Building view.
>
> real    0m49.762s
> user    0m0.006s
> sys     0m0.005s
> done
>
>
> If there's no objection, I'll push that patch.
>
> Also, another note: I noticed some time ago that with master, using OTP
> R15B, I got a performance drop of 10% to 15% compared to using master
> with OTP R14B04. Maybe it applies to 1.2.x as well.
>
>
> On Mon, Feb 27, 2012 at 5:33 AM, Robert Newson <rnew...@apache.org> wrote:
>> Bob D, can you give more details on the data set you're testing?
>> Number of docs, size/complexity of docs, etc? Basically, enough info
>> that I could write a script to automate building an equivalent
>> database.
>>
>> I wrote a quick bash script to make a database and time a view build
>> here: http://friendpaste.com/7kBiKJn3uX1KiGJAFPv4nK
>>
>> B.
>>
>> On 27 February 2012 13:15, Jan Lehnardt <j...@apache.org> wrote:
>>>
>>> On Feb 27, 2012, at 12:58 , Bob Dionne wrote:
>>>
>>>> Thanks for the clarification. I hope I'm not conflating things by 
>>>> continuing the discussion here, I thought that's what you requested?
>>>
>>> The discussion we had on IRC was regarding collecting more data items for 
>>> the performance regression before we start to draw conclusions.
>>>
>>> My intention here is to understand what needs doing before we can release 
>>> 1.2.0.
>>>
>>> I'll reply inline for the other issues.
>>>
>>>> I just downloaded the release candidate again to start fresh. "make 
>>>> distcheck" hangs on this step:
>>>>
>>>> /Users/bitdiddle/Downloads/apache-couchdb-1.2.0/apache-couchdb-1.2.0/_build/../test/etap/150-invalid-view-seq.t
>>>>  ......... 6/?
>>>>
>>>> Just stops completely. This is on R15B which has been rebuilt to use the 
>>>> recommended older SSL version. I haven't looked into this crashing too 
>>>> closely but I'm suspicious that I only see it with couchdb and never with 
>>>> bigcouch and never using the 1.2.x branch from source or any branch for 
>>>> that matter
>>>
>>> From the release you should run `make check`, not make distcheck. But I 
>>> assume you see a hang there too, as I have and others (yet not everybody), 
>>> too. I can't comment on BigCouch and what is different there. It is 
>>> interesting that 1.2.x won't hang. For me, `make check` in 1.2.x on R15B 
>>> hangs sometimes, in different places. I'm currently trying to gather more 
>>> information about this.
>>>
>>> The question here is whether `make check` passing in R15B is a release 
>>> requirement. In my vote I considered no, but I am happy to go with a 
>>> community decision if it emerges. What is your take here?
>>>
>>> In addition, this just shouldn't be a question, so we should investigate 
>>> why this happens at all and address the issue, hence COUCHDB-1424. Any 
>>> insight here would be appreciated as well.
>>>
>>>
>>>> In the command line tests, 2, 7, 27, and 32 fail, but it differs from run
>>>> to run.
>>>
>>> I assume you mean the JS tests. Again, this isn't supposed to work in 
>>> 1.2.x. I'm happy to backport my changes from master to 1.2.x to make that 
>>> work, but I refrained from that because I didn't want to bring too much 
>>> change to a release branch. I'm happy to reconsider, but I don't think a 
>>> release vote is a good place to discuss feature backports.
>>>
>>>
>>>> On Chrome attachment_ranges fails and it hangs on replicator_db
>>>
>>> This one is an "explaining away", but I think it is warranted. Chrome is
>>> broken for attachment_ranges. I don't know if we reported this upstream
>>> (Robert N?), but this isn't a release blocker. For the replicator_db test,
>>> can you try running that in other browsers? I understand it is not the best
>>> of situations (hence the move to the cli test suite for master), but if you
>>> get this test to pass in at least one other browser, this isn't a problem
>>> that holds 1.2.x.
>>>
>>>
>>>> With respect to performance I think comparisons with 1.1.x are important. 
>>>> I think almost any use case, contrived or otherwise should not be 
>>>> dismissed as a pathological or edge case. Bob's script is as simple as it 
>>>> gets and to me is a great smoke test. We need to figure out the reason 1.2 
>>>> is clearly slower in this case. If there are specific scenarios that 1.2.x 
>>>> is optimized for then we should document that and provide reasons for the 
>>>> trade-offs
>>>
>>> I want to make absolutely clear that I take any report of performance 
>>> regression very seriously. But I'm rather annoyed that no information about 
>>> this ends up on dev@. I understand that on IRC there's some shared 
>>> understanding of a few scenarios where performance regressions can be 
>>> shown. I asked three times now that these be posted to this mailing list. 
>>> I'm not asking for a comprehensive report, but anything really. I found 
>>> Robert Newson's simple test script on IRC and ran that to test a suspicion 
>>> of mine which I posted in an earlier mail (tiny docs -> slower, bigger docs 
>>> -> faster). Nobody else bothered to post this here. I see no discussion 
>>> about what is observed, what is expected, what would be acceptable for a 
>>> release of 1.2.0 as is and what not.
>>>
>>> As far as this list is concerned, we know that a few people claimed that 
>>> things are slower and it's very real and that we should hold the 1.2.0 
>>> release for it. I'm more than happy to hold the release until we figured 
>>> out the things I asked for above and help out figuring it all out. But we 
>>> need something to work with here.
>>>
>>> I also understand that this is a voluntary project and people don't have 
>>> infinite time to spend, but at least a message of "we're collecting things, 
>>> will report when done", would be *great* to start. So far we only have a 
>>> "hold the horses, there might be a something going on".
>>>
>>> Please let me know if this request is unreasonable or whether I am 
>>> overreacting.
>>>
>>> Sorry for the rant.
>>>
>>> To anyone who has been looking into performance regression, can you please 
>>> send to this list any info you have? If you have a comprehensive analysis, 
>>> awesome, if you just ran some script on a machine, just send us that, let's 
>>> collect all the data to get this situation solved! We need your help.
>>>
>>>
>>> tl;dr:
>>>
>>> There are three issues at hand:
>>>
>>>  - Robert D -1'd a release artefact. We want to understand what needs to 
>>> happen to make a release. This includes assessing the issues he raises and 
>>> squaring them against the release vote.
>>>
>>>  - There's a vague (as far as dev@ is concerned) report about a performance 
>>> regression. We need to get behind that.
>>>
>>>  - There's been a non-dev@ discussion about the performance regression and 
>>> that is referenced to influence a dev@ decision. We need that discussion's 
>>> information on dev@ to proceed.
>>>
>>>
>>> And to make it absolutely clear again. The performance regression *is* an 
>>> issue and I am very grateful for the people, including Robert Newson, 
>>> Robert Dionne and Jason Smith, who look into it. It's just that we need to 
>>> treat this as an issue and get all this info onto dev@ or into JIRA.
>>>
>>>
>>> Cheers
>>> Jan
>>> --
>>>
>>>
>>>
>>>>
>>>> Cheers,
>>>>
>>>> Bob
>>>>
>>>>
>>>> On Feb 26, 2012, at 4:07 PM, Jan Lehnardt wrote:
>>>>
>>>>> Bob,
>>>>>
>>>>> thanks for your reply
>>>>>
>>>>> I wasn't implying we should try to explain anything away. All of these 
>>>>> are valid concerns, I just wanted to get a better understanding on where 
>>>>> the bit flips from +0 to -1 and subsequently, how to address that 
>>>>> boundary.
>>>>
>>>>
>>>>
>>>>
>>>>> Ideally we can just fix all of the things you mention, but I think it is 
>>>>> important to understand them in detail, that's why I was going into them. 
>>>>> Ultimately, I want to understand what we need to do to ship 1.2.0.
>>>>>
>>>>> On Feb 26, 2012, at 21:22 , Bob Dionne wrote:
>>>>>
>>>>>> Jan,
>>>>>>
>>>>>> I'm -1 based on all of my evaluation. I've spent a few hours on this 
>>>>>> release now yesterday and today. It doesn't really pass what I would 
>>>>>> call the "smoke test". Almost everything I've run into has an 
>>>>>> explanation:
>>>>>>
>>>>>> 1. crashes out of the box - that's R15B, you need to recompile SSL and 
>>>>>> Erlang (we'll note on release notes)
>>>>>
>>>>> Have we spent any time on figuring out what the trouble here is?
>>>>>
>>>>>
>>>>>> 2. etaps hang running make check. Known issue. Our etap code is out of 
>>>>>> date, recent versions of etap don't even run their own unit tests
>>>>>
>>>>> I have seen the etap hang as well, and I wasn't diligent enough to report 
>>>>> it in JIRA, I have done so now (COUCHDB-1424).
>>>>>
>>>>>
>>>>>> 3. Futon tests fail. Some are known bugs (attachment ranges in Chrome) . 
>>>>>> Both Chrome and Safari also hang
>>>>>
>>>>> Do you have more details on where Chrome and Safari hang? Can you try 
>>>>> their private browsing features, double/triple check that caches are 
>>>>> empty? Can you get to a situation where you get all tests succeeding 
>>>>> across all browsers, even if individual ones fail on one or two others?
>>>>>
>>>>>
>>>>>> 4. standalone JS tests fail. Again most of these run when run by 
>>>>>> themselves
>>>>>
>>>>> Which ones?
>>>>>
>>>>>
>>>>>> 5. performance. I used real production data *because* Stefan on user 
>>>>>> reported performance degradation on his data set. Any numbers are 
>>>>>> meaningless for a single test. I also ran scripts that BobN and Jason 
>>>>>> Smith posted that show a difference between 1.1.x and 1.2.x
>>>>>
>>>>> You are conflating an IRC discussion we've had into this thread. The 
>>>>> performance regression reported is a good reason to look into other 
>>>>> scenarios where we can show slowdowns. But we need to understand what's 
>>>>> happening. Just from looking at dev@ all I see is some handwaving about 
>>>>> some reports some people have done (Not to discourage any work that has 
>>>>> been done on IRC and user@, but for the sake of a release vote thread, 
>>>>> this related information needs to be on this mailing list).
>>>>>
>>>>> As I said on IRC, I'm happy to get my hands dirty to understand the 
>>>>> regression at hand. But we need to know where we'd draw a line and say 
>>>>> this isn't acceptable for a 1.2.0.
>>>>>
>>>>>
>>>>>> 6. Reviewed patch pointed to by Jason that may be the cause but it's 
>>>>>> hard to say without knowing the code analysis that went into the 
>>>>>> changes. You can see obvious local optimizations that make good sense 
>>>>>> but those are often the ones that get you, without knowing the call 
>>>>>> counts.
>>>>>
>>>>> That is a point that wasn't included in your previous mail. It's great 
>>>>> that there is progress, thanks for looking into this!
>>>>>
>>>>>
>>>>>> Many of these issues can be explained away, but I think end users will 
>>>>>> be less forgiving. I think we already struggle with view performance. 
>>>>>> I'm interested to see how others evaluate this regression.
>>>>>> I'll try this seatoncouch tool you mention later to see if I can 
>>>>>> construct some more definitive tests.
>>>>>
>>>>> Again, I'm not trying to explain anything away. I want to get a shared 
>>>>> understanding of the issues you raised and where we stand on solving them 
>>>>> squared against the ongoing 1.2.0 release.
>>>>>
>>>>> And again: Thanks for doing this thorough review and looking into 
>>>>> performance issue. I hope with your help we can understand all these 
>>>>> things a lot better very soon :)
>>>>>
>>>>> Cheers
>>>>> Jan
>>>>> --
>>>>>
>>>>>
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Bob
>>>>>> On Feb 26, 2012, at 2:29 PM, Jan Lehnardt wrote:
>>>>>>
>>>>>>>
>>>>>>> On Feb 26, 2012, at 13:58 , Bob Dionne wrote:
>>>>>>>
>>>>>>>> -1
>>>>>>>>
>>>>>>>> R15B on OS X Lion
>>>>>>>>
>>>>>>>> I rebuilt OTP with an older SSL and that gets past all the crashes 
>>>>>>>> (thanks Filipe). I still see hangs when running make check, though any 
>>>>>>>> particular etap that hangs will run ok by itself. The Futon tests 
>>>>>>>> never run to completion in Chrome without hanging and the standalone 
>>>>>>>> JS tests also have fails.
>>>>>>>
>>>>>>> What part of this do you consider the -1? Can you try running the JS 
>>>>>>> tests in Firefox and or Safari? Can you get all tests pass at least 
>>>>>>> once across all browsers? The cli JS suite isn't supposed to work, so 
>>>>>>> that isn't a criterion. I've seen the hang in make check for R15B while 
>>>>>>> individual tests run as well, but I don't consider this blocking. While 
>>>>>>> I understand and support the notion that tests shouldn't fail, period, 
>>>>>>> we gotta work with what we have and master already has significant 
>>>>>>> improvements. What would you like to see changed to not -1 this release?
>>>>>>>
>>>>>>>> I tested the performance of view indexing, using a modest 200K doc db
>>>>>>>> with a large complex view and there's a clear regression between 1.1.x
>>>>>>>> and 1.2.x. Others report similar results.
>>>>>>>
>>>>>>> What is a large complex view? The complexity of the map/reduce 
>>>>>>> functions is rarely an indicator of performance, it's usually input doc 
>>>>>>> size and output/emit()/reduce data size. How big are the docs in your 
>>>>>>> test and how big is the returned data? I understand the changes for 
>>>>>>> 1.2.x will improve larger-data scenarios more significantly.
>>>>>>>
>>>>>>> Cheers
>>>>>>> Jan
>>>>>>> --
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> On Feb 23, 2012, at 5:25 PM, Bob Dionne wrote:
>>>>>>>>
>>>>>>>>> sorry Noah, I'm in debug mode today so I don't care to start mucking 
>>>>>>>>> with my stack, recompiling erlang, etc...
>>>>>>>>>
>>>>>>>>> I did try using that build repeatedly and it crashes all the time. I 
>>>>>>>>> find it very odd and I had seen those before as I said on my older 
>>>>>>>>> macbook.
>>>>>>>>>
>>>>>>>>> I do see the hangs Jan describes in the etaps, they have been there 
>>>>>>>>> right along, so I'm confident this is just the SSL issue. Why it only
>>>>>>>>> happens on the build is puzzling, any source build of any branch 
>>>>>>>>> works just peachy.
>>>>>>>>>
>>>>>>>>> So I'd say I'm +1 based on my use of the 1.2.x branch but I'd like to 
>>>>>>>>> hear from Stefan, who reported the severe performance regression. 
>>>>>>>>> BobN seems to think we can ignore that, it's something flaky in that 
>>>>>>>>> fellow's environment. I tend to agree but I'm conservative
>>>>>>>>>
>>>>>>>>> On Feb 23, 2012, at 1:23 PM, Noah Slater wrote:
>>>>>>>>>
>>>>>>>>>> Can someone convince me this bus error stuff and segfaults is not a
>>>>>>>>>> blocking issue?
>>>>>>>>>>
>>>>>>>>>> Bob tells me that he's followed the steps above and he's still 
>>>>>>>>>> experiencing
>>>>>>>>>> the issues.
>>>>>>>>>>
>>>>>>>>>> Bob, you did follow the steps to install your own SSL right?
>>>>>>>>>>
>>>>>>>>>> On Thu, Feb 23, 2012 at 5:09 PM, Jan Lehnardt <j...@apache.org> 
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Feb 23, 2012, at 00:28 , Noah Slater wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hello,
>>>>>>>>>>>>
>>>>>>>>>>>> I would like to call a vote for the Apache CouchDB 1.2.0 release,
>>>>>>>>>>>> second round.
>>>>>>>>>>>>
>>>>>>>>>>>> We encourage the whole community to download and test these
>>>>>>>>>>>> release artifacts so that any critical issues can be resolved
>>>>>>>>>>>> before the release is made. Everyone is free to vote on this
>>>>>>>>>>>> release, so get stuck in!
>>>>>>>>>>>>
>>>>>>>>>>>> We are voting on the following release artifacts:
>>>>>>>>>>>>
>>>>>>>>>>>> http://people.apache.org/~nslater/dist/1.2.0/
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> These artifacts have been built from the following tree-ish in Git:
>>>>>>>>>>>>
>>>>>>>>>>>> 4cd60f3d1683a3445c3248f48ae064fb573db2a1
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Please follow the test procedure before voting:
>>>>>>>>>>>>
>>>>>>>>>>>> http://wiki.apache.org/couchdb/Test_procedure
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>
>>>>>>>>>>>> Happy voting,
>>>>>>>>>>>
>>>>>>>>>>> Signature and hashes check out.
>>>>>>>>>>>
>>>>>>>>>>> Mac OS X 10.7.3, 64bit, SpiderMonkey 1.8.0, Erlang R14B04: make check
>>>>>>>>>>> works fine, browser tests in Safari work fine.
>>>>>>>>>>>
>>>>>>>>>>> Mac OS X 10.7.3, 64bit, SpiderMonkey 1.8.5, Erlang R14B04: make check
>>>>>>>>>>> works fine, browser tests in Safari work fine.
>>>>>>>>>>>
>>>>>>>>>>> FreeBSD 9.0, 64bit, SpiderMonkey 1.7.0, Erlang R14B04: make check
>>>>>>>>>>> works fine, browser tests in Safari work fine.
>>>>>>>>>>>
>>>>>>>>>>> CentOS 6.2, 64bit, SpiderMonkey 1.8.5, Erlang R14B04: make check
>>>>>>>>>>> works fine, browser tests in Firefox work fine.
>>>>>>>>>>>
>>>>>>>>>>> Ubuntu 11.4, 64bit, SpiderMonkey 1.8.5, Erlang R14B02: make check
>>>>>>>>>>> works fine, browser tests in Firefox work fine.
>>>>>>>>>>>
>>>>>>>>>>> Ubuntu 10.4, 32bit, SpiderMonkey 1.8.0, Erlang R13B03: make check
>>>>>>>>>>> fails in
>>>>>>>>>>> - 076-file-compression.t: https://gist.github.com/1893373
>>>>>>>>>>> - 220-compaction-daemon.t: https://gist.github.com/1893387
>>>>>>>>>>> This one runs in a VM and is 32bit, so I don't know if there's
>>>>>>>>>>> anything in the tests that relies on 64bittyness or the R14B03.
>>>>>>>>>>> Filipe, I think you worked on both features, do you have an idea?
>>>>>>>>>>>
>>>>>>>>>>> I tried running it all through Erlang R15B on Mac OS X 10.7.3, but a
>>>>>>>>>>> good way into `make check` the tests would just stop and hang. The
>>>>>>>>>>> last time, repeatedly in 160-vhosts.t, but when run alone, that test
>>>>>>>>>>> finished in under five seconds. I'm not sure what the issue is here.
>>>>>>>>>>>
>>>>>>>>>>> Despite the things above, I'm happy to give this a +1 if we put a
>>>>>>>>>>> warning about R15B on the download page.
>>>>>>>>>>>
>>>>>>>>>>> Great work all!
>>>>>>>>>>>
>>>>>>>>>>> Cheers
>>>>>>>>>>> Jan
>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>
>
>
> --
> Filipe David Manana,
>
> "Reasonable men adapt themselves to the world.
>  Unreasonable men adapt the world to themselves.
>  That's why all progress depends on unreasonable men."
