Hi, Filipe. Most people seem to be holding their OTP build constant for these tests.
If you have the time, would you please check out
https://github.com/jhs/slow_couchdb

It uses seatoncouch mixed with Bob's script to run a basic benchmark. I
expect more template types to grow to help create different data profiles.

Anyway, here are my results with 500k documents. Note that I built from
your optimization commit, then its parent.

https://gist.github.com/1928169

tl;dr = 2:50 before your commit; 4:13 after.

On Mon, Feb 27, 2012 at 11:33 PM, Filipe David Manana <fdman...@apache.org> wrote:
> I just tried Jason's script (modified it to use 500 000 docs instead
> of 50 000) against 1.2.x and 1.1.1, using OTP R14B03. Here's my
> results:
>
> 1.2.x:
>
> $ port=5984 ./test.sh
> "none"
> Filling db.
> done
> HTTP/1.1 200 OK
> Server: CouchDB/1.2.0 (Erlang OTP/R14B03)
> Date: Mon, 27 Feb 2012 16:08:43 GMT
> Content-Type: text/plain; charset=utf-8
> Content-Length: 252
> Cache-Control: must-revalidate
>
> {"db_name":"db1","doc_count":500001,"doc_del_count":0,"update_seq":500001,"purge_seq":0,"compact_running":false,"disk_size":130494577,"data_size":130490673,"instance_start_time":"1330358830830086","disk_format_version":6,"committed_update_seq":500001}
> Building view.
>
> real 1m5.725s
> user 0m0.006s
> sys 0m0.005s
> done
>
>
> 1.1.1:
>
> $ port=5984 ./test.sh
> ""
> Filling db.
> done
> HTTP/1.1 200 OK
> Server: CouchDB/1.1.2a785d32f-git (Erlang OTP/R14B03)
> Date: Mon, 27 Feb 2012 16:15:33 GMT
> Content-Type: text/plain;charset=utf-8
> Content-Length: 230
> Cache-Control: must-revalidate
>
> {"db_name":"db1","doc_count":500001,"doc_del_count":0,"update_seq":500001,"purge_seq":0,"compact_running":false,"disk_size":122142818,"instance_start_time":"1330359233327316","disk_format_version":5,"committed_update_seq":500001}
> Building view.
>
> real 1m4.249s
> user 0m0.006s
> sys 0m0.005s
> done
>
>
> I don't see any significant difference there.
>
> Regarding COUCHDB-1186, the only thing that might cause some
> non-determinism and affect performance is the queueing/dequeueing. Depending
> on timings, it's possible the writer is dequeueing fewer items per
> dequeue operation and therefore inserting smaller batches into the
> btree. The following small change ensures larger batches (while still
> respecting the queue max size/item count):
>
> http://friendpaste.com/178nPFgfyyeGf2vtNRpL0w
>
> Running the test with this change:
>
> $ port=5984 ./test.sh
> "none"
> Filling db.
> done
> HTTP/1.1 200 OK
> Server: CouchDB/1.2.0 (Erlang OTP/R14B03)
> Date: Mon, 27 Feb 2012 16:23:20 GMT
> Content-Type: text/plain; charset=utf-8
> Content-Length: 252
> Cache-Control: must-revalidate
>
> {"db_name":"db1","doc_count":500001,"doc_del_count":0,"update_seq":500001,"purge_seq":0,"compact_running":false,"disk_size":130494577,"data_size":130490673,"instance_start_time":"1330359706846104","disk_format_version":6,"committed_update_seq":500001}
> Building view.
>
> real 0m49.762s
> user 0m0.006s
> sys 0m0.005s
> done
>
>
> If there's no objection, I'll push that patch.
>
> Also, another note, I noticed some time ago that with master, using OTP
> R15B I got a performance drop of 10% to 15% compared to using master
> with OTP R14B04. Maybe it applies to 1.2.x as well.
>
>
> On Mon, Feb 27, 2012 at 5:33 AM, Robert Newson <rnew...@apache.org> wrote:
>> Bob D, can you give more details on the data set you're testing?
>> Number of docs, size/complexity of docs, etc? Basically, enough info
>> that I could write a script to automate building an equivalent
>> database.
>>
>> I wrote a quick bash script to make a database and time a view build
>> here: http://friendpaste.com/7kBiKJn3uX1KiGJAFPv4nK
>>
>> B.
>>
>> On 27 February 2012 13:15, Jan Lehnardt <j...@apache.org> wrote:
>>>
>>> On Feb 27, 2012, at 12:58 , Bob Dionne wrote:
>>>
>>>> Thanks for the clarification.
>>>> I hope I'm not conflating things by continuing the discussion here, I
>>>> thought that's what you requested?
>>>
>>> The discussion we had on IRC was regarding collecting more data items for
>>> the performance regression before we start to draw conclusions.
>>>
>>> My intention here is to understand what needs doing before we can release
>>> 1.2.0.
>>>
>>> I'll reply inline for the other issues.
>>>
>>>> I just downloaded the release candidate again to start fresh. "make
>>>> distcheck" hangs on this step:
>>>>
>>>> /Users/bitdiddle/Downloads/apache-couchdb-1.2.0/apache-couchdb-1.2.0/_build/../test/etap/150-invalid-view-seq.t
>>>> ......... 6/?
>>>>
>>>> Just stops completely. This is on R15B which has been rebuilt to use the
>>>> recommended older SSL version. I haven't looked into this crashing too
>>>> closely but I'm suspicious that I only see it with couchdb and never with
>>>> bigcouch and never using the 1.2.x branch from source or any branch for
>>>> that matter
>>>
>>> From the release you should run `make check`, not make distcheck. But I
>>> assume you see a hang there too, as I have and others (yet not everybody),
>>> too. I can't comment on BigCouch and what is different there. It is
>>> interesting that 1.2.x won't hang. For me, `make check` in 1.2.x on R15B
>>> hangs sometimes, in different places. I'm currently trying to gather more
>>> information about this.
>>>
>>> The question here is whether `make check` passing in R15B is a release
>>> requirement. In my vote I considered no, but I am happy to go with a
>>> community decision if it emerges. What is your take here?
>>>
>>> In addition, this just shouldn't be a question, so we should investigate
>>> why this happens at all and address the issue, hence COUCHDB-1424. Any
>>> insight here would be appreciated as well.
>>>
>>>
>>>> In the command line tests, 2, 7, 27, and 32 fail, but it differs from run
>>>> to run.
>>>
>>> I assume you mean the JS tests.
>>> Again, this isn't supposed to work in 1.2.x. I'm happy to backport my
>>> changes from master to 1.2.x to make that work, but I refrained from that
>>> because I didn't want to bring too much change to a release branch. I'm
>>> happy to reconsider, but I don't think a release vote is a good place to
>>> discuss feature backports.
>>>
>>>
>>>> On Chrome attachment_ranges fails and it hangs on replicator_db
>>>
>>> This one is an "explaining away", but I think it is warranted. Chrome is
>>> broken for attachment_ranges. I don't know if we reported this upstream
>>> (Robert N?), but this isn't a release blocker. For the replicator_db test,
>>> can you try running that in other browsers. I understand it is not the best
>>> of situations (hence the move to the cli test suite for master), but if you
>>> get this test to pass in at least one other browser, this isn't a problem
>>> that holds 1.2.x.
>>>
>>>
>>>> With respect to performance I think comparisons with 1.1.x are important.
>>>> I think almost any use case, contrived or otherwise should not be
>>>> dismissed as a pathological or edge case. Bob's script is as simple as it
>>>> gets and to me is a great smoke test. We need to figure out the reason 1.2
>>>> is clearly slower in this case. If there are specific scenarios that 1.2.x
>>>> is optimized for then we should document that and provide reasons for the
>>>> trade-offs
>>>
>>> I want to make absolutely clear that I take any report of performance
>>> regression very seriously. But I'm rather annoyed that no information about
>>> this ends up on dev@. I understand that on IRC there's some shared
>>> understanding of a few scenarios where performance regressions can be
>>> shown. I asked three times now that these be posted to this mailing list.
>>> I'm not asking for a comprehensive report, but anything really.
>>> I found Robert Newson's simple test script on IRC and ran that to test a
>>> suspicion of mine which I posted in an earlier mail (tiny docs -> slower,
>>> bigger docs -> faster). Nobody else bothered to post this here. I see no
>>> discussion about what is observed, what is expected, what would be
>>> acceptable for a release of 1.2.0 as is and what not.
>>>
>>> As far as this list is concerned, we know that a few people claimed that
>>> things are slower and it's very real and that we should hold the 1.2.0
>>> release for it. I'm more than happy to hold the release until we figured
>>> out the things I asked for above and help out figuring it all out. But we
>>> need something to work with here.
>>>
>>> I also understand that this is a voluntary project and people don't have
>>> infinite time to spend, but at least a message of "we're collecting things,
>>> will report when done", would be *great* to start. So far we only have a
>>> "hold the horses, there might be something going on".
>>>
>>> Please let me know if this request is unreasonable or whether I am
>>> overreacting.
>>>
>>> Sorry for the rant.
>>>
>>> To anyone who has been looking into performance regression, can you please
>>> send to this list any info you have? If you have a comprehensive analysis,
>>> awesome, if you just ran some script on a machine, just send us that, let's
>>> collect all the data to get this situation solved! We need your help.
>>>
>>>
>>> tl;dr:
>>>
>>> There's three issues at hand:
>>>
>>> - Robert D -1'd a release artefact. We want to understand what needs to
>>> happen to make a release. This includes assessing the issues he raises and
>>> squaring them against the release vote.
>>>
>>> - There's a vague (as far as dev@ is concerned) report about a performance
>>> regression. We need to get behind that.
>>>
>>> - There's been a non-dev@ discussion about the performance regression and
>>> that is referenced to influence a dev@ decision.
>>> We need that discussion's information on dev@ to proceed.
>>>
>>>
>>> And to make it absolutely clear again. The performance regression *is* an
>>> issue and I am very grateful for the people, including Robert Newson,
>>> Robert Dionne and Jason Smith, who look into it. It's just that we need to
>>> treat this as an issue and get all this info onto dev@ or into JIRA.
>>>
>>>
>>> Cheers
>>> Jan
>>> --
>>>
>>>
>>>
>>>>
>>>> Cheers,
>>>>
>>>> Bob
>>>>
>>>>
>>>> On Feb 26, 2012, at 4:07 PM, Jan Lehnardt wrote:
>>>>
>>>>> Bob,
>>>>>
>>>>> thanks for your reply
>>>>>
>>>>> I wasn't implying we should try to explain anything away. All of these
>>>>> are valid concerns, I just wanted to get a better understanding on where
>>>>> the bit flips from +0 to -1 and subsequently, how to address that
>>>>> boundary.
>>>>
>>>>
>>>>
>>>>
>>>>> Ideally we can just fix all of the things you mention, but I think it is
>>>>> important to understand them in detail, that's why I was going into them.
>>>>> Ultimately, I want to understand what we need to do to ship 1.2.0.
>>>>>
>>>>> On Feb 26, 2012, at 21:22 , Bob Dionne wrote:
>>>>>
>>>>>> Jan,
>>>>>>
>>>>>> I'm -1 based on all of my evaluation. I've spent a few hours on this
>>>>>> release now yesterday and today. It doesn't really pass what I would
>>>>>> call the "smoke test". Almost everything I've run into has an
>>>>>> explanation:
>>>>>>
>>>>>> 1. crashes out of the box - that's R15B, you need to recompile SSL and
>>>>>> Erlang (we'll note on release notes)
>>>>>
>>>>> Have we spent any time on figuring out what the trouble here is?
>>>>>
>>>>>
>>>>>> 2. etaps hang running make check. Known issue. Our etap code is out of
>>>>>> date, recent versions of etap don't even run their own unit tests
>>>>>
>>>>> I have seen the etap hang as well, and I wasn't diligent enough to report
>>>>> it in JIRA, I have done so now (COUCHDB-1424).
>>>>>
>>>>>
>>>>>> 3. Futon tests fail. Some are known bugs (attachment ranges in Chrome).
>>>>>> Both Chrome and Safari also hang
>>>>>
>>>>> Do you have more details on where Chrome and Safari hang? Can you try
>>>>> their private browsing features, double/triple check that caches are
>>>>> empty? Can you get to a situation where you get all tests succeeding
>>>>> across all browsers, even if individual ones fail on one or two others?
>>>>>
>>>>>
>>>>>> 4. standalone JS tests fail. Again most of these run when run by
>>>>>> themselves
>>>>>
>>>>> Which ones?
>>>>>
>>>>>
>>>>>> 5. performance. I used real production data *because* Stefan on user
>>>>>> reported performance degradation on his data set. Any numbers are
>>>>>> meaningless for a single test. I also ran scripts that BobN and Jason
>>>>>> Smith posted that show a difference between 1.1.x and 1.2.x
>>>>>
>>>>> You are conflating an IRC discussion we've had into this thread. The
>>>>> performance regression reported is a good reason to look into other
>>>>> scenarios where we can show slowdowns. But we need to understand what's
>>>>> happening. Just from looking at dev@ all I see is some handwaving about
>>>>> some reports some people have done (Not to discourage any work that has
>>>>> been done on IRC and user@, but for the sake of a release vote thread,
>>>>> this related information needs to be on this mailing list).
>>>>>
>>>>> As I said on IRC, I'm happy to get my hands dirty to understand the
>>>>> regression at hand. But we need to know where we'd draw a line and say
>>>>> this isn't acceptable for a 1.2.0.
>>>>>
>>>>>
>>>>>> 6. Reviewed patch pointed to by Jason that may be the cause but it's
>>>>>> hard to say without knowing the code analysis that went into the
>>>>>> changes. You can see obvious local optimizations that make good sense
>>>>>> but those are often the ones that get you, without knowing the call
>>>>>> counts.
>>>>>
>>>>> That is a point that wasn't included in your previous mail. It's great
>>>>> that there is progress, thanks for looking into this!
>>>>>
>>>>>
>>>>>> Many of these issues can be explained away, but I think end users will
>>>>>> be less forgiving. I think we already struggle with view performance.
>>>>>> I'm interested to see how others evaluate this regression.
>>>>>> I'll try this seatoncouch tool you mention later to see if I can
>>>>>> construct some more definitive tests.
>>>>>
>>>>> Again, I'm not trying to explain anything away. I want to get a shared
>>>>> understanding of the issues you raised and where we stand on solving them
>>>>> squared against the ongoing 1.2.0 release.
>>>>>
>>>>> And again: Thanks for doing this thorough review and looking into
>>>>> performance issue. I hope with your help we can understand all these
>>>>> things a lot better very soon :)
>>>>>
>>>>> Cheers
>>>>> Jan
>>>>> --
>>>>>
>>>>>
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Bob
>>>>>> On Feb 26, 2012, at 2:29 PM, Jan Lehnardt wrote:
>>>>>>
>>>>>>>
>>>>>>> On Feb 26, 2012, at 13:58 , Bob Dionne wrote:
>>>>>>>
>>>>>>>> -1
>>>>>>>>
>>>>>>>> R15B on OS X Lion
>>>>>>>>
>>>>>>>> I rebuilt OTP with an older SSL and that gets past all the crashes
>>>>>>>> (thanks Filipe). I still see hangs when running make check, though any
>>>>>>>> particular etap that hangs will run ok by itself. The Futon tests
>>>>>>>> never run to completion in Chrome without hanging and the standalone
>>>>>>>> JS tests also have fails.
>>>>>>>
>>>>>>> What part of this do you consider the -1? Can you try running the JS
>>>>>>> tests in Firefox and or Safari? Can you get all tests pass at least
>>>>>>> once across all browsers? The cli JS suite isn't supposed to work, so
>>>>>>> that isn't a criterion. I've seen the hang in make check for R15B while
>>>>>>> individual tests run as well, but I don't consider this blocking. While
>>>>>>> I understand and support the notion that tests shouldn't fail, period,
>>>>>>> we gotta work with what we have and master already has significant
>>>>>>> improvements.
>>>>>>> What would you like to see changed to not -1 this release?
>>>>>>>
>>>>>>>> I tested the performance of view indexing, using a modest 200K doc db
>>>>>>>> with a large complex view and there's a clear regression between 1.1.x
>>>>>>>> and 1.2.x. Others report similar results
>>>>>>>
>>>>>>> What is a large complex view? The complexity of the map/reduce
>>>>>>> functions is rarely an indicator of performance, it's usually input doc
>>>>>>> size and output/emit()/reduce data size. How big are the docs in your
>>>>>>> test and how big is the returned data? I understand the changes for
>>>>>>> 1.2.x will improve larger-data scenarios more significantly.
>>>>>>>
>>>>>>> Cheers
>>>>>>> Jan
>>>>>>> --
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> On Feb 23, 2012, at 5:25 PM, Bob Dionne wrote:
>>>>>>>>
>>>>>>>>> sorry Noah, I'm in debug mode today so I don't care to start mucking
>>>>>>>>> with my stack, recompiling erlang, etc...
>>>>>>>>>
>>>>>>>>> I did try using that build repeatedly and it crashes all the time. I
>>>>>>>>> find it very odd and I had seen those before as I said on my older
>>>>>>>>> macbook.
>>>>>>>>>
>>>>>>>>> I do see the hangs Jan describes in the etaps, they have been there
>>>>>>>>> right along, so I'm confident this is just the SSL issue. Why it only
>>>>>>>>> happens on the build is puzzling, any source build of any branch
>>>>>>>>> works just peachy.
>>>>>>>>>
>>>>>>>>> So I'd say I'm +1 based on my use of the 1.2.x branch but I'd like to
>>>>>>>>> hear from Stefan, who reported the severe performance regression.
>>>>>>>>> BobN seems to think we can ignore that, it's something flaky in that
>>>>>>>>> fellow's environment. I tend to agree but I'm conservative
>>>>>>>>>
>>>>>>>>> On Feb 23, 2012, at 1:23 PM, Noah Slater wrote:
>>>>>>>>>
>>>>>>>>>> Can someone convince me this bus error stuff and segfaults is not a
>>>>>>>>>> blocking issue.
>>>>>>>>>>
>>>>>>>>>> Bob tells me that he's followed the steps above and he's still
>>>>>>>>>> experiencing the issues.
>>>>>>>>>>
>>>>>>>>>> Bob, you did follow the steps to install your own SSL right?
>>>>>>>>>>
>>>>>>>>>> On Thu, Feb 23, 2012 at 5:09 PM, Jan Lehnardt <j...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Feb 23, 2012, at 00:28 , Noah Slater wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hello,
>>>>>>>>>>>>
>>>>>>>>>>>> I would like to call a vote for the Apache CouchDB 1.2.0 release,
>>>>>>>>>>>> second round.
>>>>>>>>>>>>
>>>>>>>>>>>> We encourage the whole community to download and test these
>>>>>>>>>>>> release artifacts so that any critical issues can be resolved
>>>>>>>>>>>> before the release is made. Everyone is free to vote on this
>>>>>>>>>>>> release, so get stuck in!
>>>>>>>>>>>>
>>>>>>>>>>>> We are voting on the following release artifacts:
>>>>>>>>>>>>
>>>>>>>>>>>> http://people.apache.org/~nslater/dist/1.2.0/
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> These artifacts have been built from the following tree-ish in Git:
>>>>>>>>>>>>
>>>>>>>>>>>> 4cd60f3d1683a3445c3248f48ae064fb573db2a1
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Please follow the test procedure before voting:
>>>>>>>>>>>>
>>>>>>>>>>>> http://wiki.apache.org/couchdb/Test_procedure
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>
>>>>>>>>>>>> Happy voting,
>>>>>>>>>>>
>>>>>>>>>>> Signature and hashes check out.
>>>>>>>>>>>
>>>>>>>>>>> Mac OS X 10.7.3, 64bit, SpiderMonkey 1.8.0, Erlang R14B04: make check
>>>>>>>>>>> works fine, browser tests in Safari work fine.
>>>>>>>>>>>
>>>>>>>>>>> Mac OS X 10.7.3, 64bit, SpiderMonkey 1.8.5, Erlang R14B04: make check
>>>>>>>>>>> works fine, browser tests in Safari work fine.
>>>>>>>>>>>
>>>>>>>>>>> FreeBSD 9.0, 64bit, SpiderMonkey 1.7.0, Erlang R14B04: make check
>>>>>>>>>>> works fine, browser tests in Safari work fine.
>>>>>>>>>>>
>>>>>>>>>>> CentOS 6.2, 64bit, SpiderMonkey 1.8.5, Erlang R14B04: make check
>>>>>>>>>>> works fine, browser tests in Firefox work fine.
>>>>>>>>>>>
>>>>>>>>>>> Ubuntu 11.4, 64bit, SpiderMonkey 1.8.5, Erlang R14B02: make check
>>>>>>>>>>> works fine, browser tests in Firefox work fine.
>>>>>>>>>>>
>>>>>>>>>>> Ubuntu 10.4, 32bit, SpiderMonkey 1.8.0, Erlang R13B03: make check
>>>>>>>>>>> fails in
>>>>>>>>>>> - 076-file-compression.t: https://gist.github.com/1893373
>>>>>>>>>>> - 220-compaction-daemon.t: https://gist.github.com/1893387
>>>>>>>>>>> This one runs in a VM and is 32bit, so I don't know if there's
>>>>>>>>>>> anything in the tests that rely on 64bittyness or the R14B03.
>>>>>>>>>>> Filipe, I think you worked on both features, do you have an idea?
>>>>>>>>>>>
>>>>>>>>>>> I tried running it all through Erlang R15B on Mac OS X 10.7.3, but a
>>>>>>>>>>> good way into `make check` the tests would just stop and hang. The
>>>>>>>>>>> last time, repeatedly in 160-vhosts.t, but when run alone, that test
>>>>>>>>>>> finished in under five seconds. I'm not sure what the issue is here.
>>>>>>>>>>>
>>>>>>>>>>> Despite the things above, I'm happy to give this a +1 if we put a
>>>>>>>>>>> warning about R15B on the download page.
>>>>>>>>>>>
>>>>>>>>>>> Great work all!
>>>>>>>>>>>
>>>>>>>>>>> Cheers
>>>>>>>>>>> Jan
>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>
>
>
> --
> Filipe David Manana,
>
> "Reasonable men adapt themselves to the world.
> Unreasonable men adapt the world to themselves.
> That's why all progress depends on unreasonable men."