Re: worse performance in 1.1.x (compared to 1.0.x)

2010-12-13 Thread Filipe David Manana
Adam,

My tests were done with OTP R14B, dual-core Thinkpad (with Linux) and
a 5400 rpms hard disk.

I think the cause for such huge differences between our tests is the
OS. The time I spent with Joel, we always had very different results
for the same tests. He also used a MacBook Pro (with OS X). (I've
heard several times that OS X's IO scheduler is worse than Linux's.)

It's weird that in your 2nd graph, trunk reads are a lot worse
compared to the 4b0948d-after-R14B01 reads.
Do you also have a Linux machine to test this?


On Mon, Dec 13, 2010 at 3:02 AM, Adam Kocoloski kocol...@apache.org wrote:
 I tried R14B01, the custom small_doc.json, and lowering schedulers_online to 
 1, but the writers continue to starve.  Trunk (4137a8e) does not starve the 
 writers, almost certainly due to your updater_fd patch.  Comparing trunk and 
 the mochiweb upgrade commit I get

 http://graphs.mikeal.couchone.com/#/graph/df0f79455c9c600f66d1ce42ea018d7e

 I think it's still important that you observe a performance regression with 
 the introduction of that patch, but something else in our respective setups 
 clearly has a much greater effect on the results.

 Regards,

 Adam

 On Dec 12, 2010, at 8:42 PM, Adam Kocoloski wrote:

 Hi Filipe, I cannot reproduce those results at all, though I didn't try 
 loading the custom small_doc.json.  If I use 200 readers and 100 writers the 
 writers are completely starved and I get e.g.

 http://graphs.mikeal.couchone.com/#/graph/df0f79455c9c600f66d1ce42ea016e07

 I need to lower the readers down to ~20 to keep the write throughput 
 reasonable.  I'm running R13B04 on a dual-core OS X 10.6 MacBook.  I'll try 
 a few more things including the custom doc and R14B01, but do you have 
 suggestions for why these results might be so dramatically different?

 Adam

 On Dec 12, 2010, at 7:17 AM, Filipe David Manana wrote:

 Hi,

 While running a relaximation test to compare read/write performance
 between 1.1.x and 1.0.x, I found out that 1.1.x has worse performance.

 It seems the cause is related to the new Mochiweb version introduced
 in commit 4b0948ddb3a428f8a5330e05745b2fbd4ccf9375 -
 https://github.com/apache/couchdb/commit/4b0948ddb3a428f8a5330e05745b2fbd4ccf9375

 Comparing the performance of this revision with the previous one
 (cd214b23e8129868d4a7020ddafd55a16e496652), I get the following
 results:

 http://graphs.mikeal.couchone.com/#/graph/df0f79455c9c600f66d1ce42ea0125e5

 Both read and write performance are worse for
 4b0948ddb3a428f8a5330e05745b2fbd4ccf9375.

 The cause could be the configuration we pass to Mochiweb in
 couch_httpd. The new Mochiweb sets the nodelay option to false by
 default and now uses several acceptor processes (16 by default).
 However, even with the following small patch I still get about the
 same disappointing results:

 diff --git a/src/couchdb/couch_httpd.erl b/src/couchdb/couch_httpd.erl
 index 23ff7f9..e93c7e7 100644
 --- a/src/couchdb/couch_httpd.erl
 +++ b/src/couchdb/couch_httpd.erl
 @@ -97,7 +97,9 @@ start_link(Name, Options) ->
    {ok, Pid} = case mochiweb_http:start(Options ++ [
        {loop, Loop},
        {name, Name},
 -        {ip, BindAddress}
 +        {ip, BindAddress},
 +        {nodelay, true},
 +        {acceptor_pool_size, 32}
    ]) of
    {ok, MochiPid} -> {ok, MochiPid};
    {error, Reason} ->


 (I also tried higher values for acceptor_pool_size, which didn't help.)

 The test was run like this:

 $ node tests/compare_write_and_read.js --wclients 100 --rclients 200 \
 --name1 new_4b0948ddb3a428f8a5330e05745b2fbd4ccf9375 \
 --name2 old_cd214b23e8129868d4a7020ddafd55a16e496652 \
 --url1 http://localhost:5984/ --url2 http://localhost:5985/ \
 --duration 120

 With the following document template (file
 relaximation/common/small_doc.json):
 http://friendpaste.com/7GKUEg0SZHmOf0g7Dh5IWC

 Can anyone confirm these results?
 If confirmed, I would say this is a blocker for 1.1.0.

 thanks


 --
 Filipe David Manana,
 fdman...@gmail.com, fdman...@apache.org

 Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men.






-- 
Filipe David Manana,
fdman...@gmail.com, fdman...@apache.org

Reasonable men adapt themselves to the world.
Unreasonable men adapt the world to themselves.
That's why all progress depends on unreasonable men.


[jira] Updated: (COUCHDB-984) Animated spinner icon has glitch

2010-12-13 Thread Sebastian Cohnen (JIRA)

 [ 
https://issues.apache.org/jira/browse/COUCHDB-984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Cohnen updated COUCHDB-984:
-

Attachment: 16x16-Spinner.gif

 Animated spinner icon has glitch
 

 Key: COUCHDB-984
 URL: https://issues.apache.org/jira/browse/COUCHDB-984
 Project: CouchDB
  Issue Type: Bug
  Components: Futon
 Environment: any
Reporter: Nathan Vander Wilt
Priority: Minor
 Fix For: 1.0.2

 Attachments: 16x16-Spinner.gif

   Original Estimate: 0.5h
  Remaining Estimate: 0.5h

 Futon's progress spinner icon found in /share/couchdb/www/image/spinner.gif 
 (used when uploading files and perhaps elsewhere) suffers from the glitch 
 described at http://www.panic.com/blog/2010/10/spinner-rage/, where the fifth 
 frame of the animation flashes more darkly than the others.
 The Panic post on this issue provides a fixed version of the spinner, but it 
 is a Photoshop file:
 http://panic.com/blog/wp-content/files/16x16%20Spinner.psd.zip
 This simply needs to be re-exported by someone with a copy of Adobe's 
 software. (I have a LazyTwitter out on this, otherwise next week I can pester 
 the designers at work for a favor.)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (COUCHDB-984) Animated spinner icon has glitch

2010-12-13 Thread Sebastian Cohnen (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12970858#action_12970858
 ] 

Sebastian Cohnen commented on COUCHDB-984:
--

Done :)


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (COUCHDB-984) Animated spinner icon has glitch

2010-12-13 Thread Paul Joseph Davis (JIRA)

 [ 
https://issues.apache.org/jira/browse/COUCHDB-984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Joseph Davis resolved COUCHDB-984.
---

Resolution: Fixed
  Assignee: Paul Joseph Davis

Fixed in trunk. Getting ready to backport to 1.1.x and 1.0.x as soon as I 
remember the commands.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (COUCHDB-985) Update handlers can make documents with blank _id

2010-12-13 Thread Nick Fisher (JIRA)
Update handlers can make documents with blank _id
-

 Key: COUCHDB-985
 URL: https://issues.apache.org/jira/browse/COUCHDB-985
 Project: CouchDB
  Issue Type: Bug
  Components: Database Core, JavaScript View Server
Affects Versions: 1.0.1
 Environment: OS X, built using brew
Reporter: Nick Fisher
Priority: Minor


Make the following update handler:
{code}
function(doc, req){
  return [{}, 'done\n'];
}
{code}

Then POST to the handler: you end up with a document whose _id is blank.
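For contrast, here is a minimal workaround sketch (mine, not from the ticket), relying on the documented update-handler contract that a handler returns a [docToSave, response] pair and that a null first element means no document is written. The name noWriteHandler is invented for illustration:

```javascript
// Workaround sketch, not a fix for the underlying bug: update handlers
// return [docToSave, response]; a null first element tells CouchDB to
// send the response without writing any document at all.
function noWriteHandler(doc, req) {
  return [null, 'done\n'];
}
```

The reported handler returns [{}, ...] instead, and an empty object carries no _id, which is presumably how the blank-_id document gets created.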

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (COUCHDB-968) Duplicated IDs in _all_docs

2010-12-13 Thread Adam Kocoloski (JIRA)

 [ 
https://issues.apache.org/jira/browse/COUCHDB-968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Kocoloski updated COUCHDB-968:
---

Comment: was deleted

(was: Adam,

Thank you!

Rachel)

 Duplicated IDs in _all_docs
 ---

 Key: COUCHDB-968
 URL: https://issues.apache.org/jira/browse/COUCHDB-968
 Project: CouchDB
  Issue Type: Bug
  Components: Database Core
Affects Versions: 0.10.1, 0.10.2, 0.11.1, 0.11.2, 1.0, 1.0.1, 1.0.2
 Environment: any
Reporter: Sebastian Cohnen
Assignee: Adam Kocoloski
Priority: Blocker
 Fix For: 0.11.3, 1.0.2, 1.1


 We have a database, which is causing serious trouble with compaction and 
 replication (huge memory and cpu usage, often causing couchdb to crash b/c 
 all system memory is exhausted). Yesterday we discovered that db/_all_docs is 
 reporting duplicated IDs (see [1]). Until a few minutes ago we thought that 
 there are only few duplicates but today I took a closer look and I found 10 
 IDs which sum up to a total of 922 duplicates. Some of them have only 1 
 duplicate, others have hundreds.
 Some facts about the database in question:
 * ~13k documents, with 3-5k revs each
 * all duplicated documents are in conflict (with 1 up to 14 conflicts)
 * compaction is run on a daily basis
 * several thousand updates per hour
 * multi-master setup with pull replication from each other
 * delayed_commits=false on all nodes
 * used couchdb versions 1.0.0 and 1.0.x (*)
 Unfortunately the database's contents are confidential and I'm not allowed to 
 publish it.
 [1]: Part of http://localhost:5984/DBNAME/_all_docs
 ...
 {id:9997,key:9997,value:{rev:6096-603c68c1fa90ac3f56cf53771337ac9f}},
 {id:,key:,value:{rev:6097-3c873ccf6875ff3c4e2c6fa264c6a180}},
 {id:,key:,value:{rev:6097-3c873ccf6875ff3c4e2c6fa264c6a180}},
 ...
 [*]
 There were two (old) servers (1.0.0) in production (already having the 
 replication and compaction issues). Then two servers (1.0.x) were added and 
 replication was set up to bring them in sync with the old production servers 
 since the two new servers were meant to replace the old ones (to update 
 node.js application code among other things).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Reopened: (COUCHDB-968) Duplicated IDs in _all_docs

2010-12-13 Thread Adam Kocoloski (JIRA)

 [ 
https://issues.apache.org/jira/browse/COUCHDB-968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Kocoloski reopened COUCHDB-968:



It turns out this series of patches does not merge key trees correctly in all 
cases.  It wrongly assumes that the InsertTree is always a linear path.  Now, 
it is true that every invocation of couch_key_tree:merge/2 has a linear 
revision path in the 2nd argument.  However, when couch_key_tree:merge_one/4 
successfully merges the inserted revision path into one of the branches of an 
existing tree, creating a new Merged branch, it turns around and tries to 
merge that Merged branch into the next branch of the tree.  At this point, all 
bets are off -- the new InsertTree (a.k.a. Merged) is a full revision tree and 
can have an arbitrary number of siblings at each level.

I believe this commit addresses the issue:

https://github.com/kocolosk/couchdb/commit/a542113796653c6ff3673e05563fa20f041e6983
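To make the linear-vs-branching distinction above concrete, here is an illustrative JavaScript sketch (mine, not CouchDB's actual Erlang couch_key_tree code; the node shape and the isLinear helper are invented for illustration):

```javascript
// A revision tree node: a key plus a list of child subtrees. A linear
// revision path has at most one child at every level; a tree produced
// by an earlier merge can branch at any level.
const linearPath = { key: '1-a', children: [{ key: '2-b', children: [] }] };

const mergedBranch = {
  key: '1-a',
  children: [
    { key: '2-b', children: [] }, // original edit branch
    { key: '2-c', children: [] }  // sibling introduced by a prior merge
  ]
};

// A merge step that assumes its input is linear would walk only
// children[0] and silently drop the '2-c' sibling.
function isLinear(node) {
  return node.children.length <= 1 && node.children.every(isLinear);
}
```

isLinear holds for linearPath but not for mergedBranch, which is exactly the assumption the patch series wrongly made about its InsertTree argument.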


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (COUCHDB-968) Duplicated IDs in _all_docs

2010-12-13 Thread Adam Kocoloski (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971165#action_12971165
 ] 

Adam Kocoloski commented on COUCHDB-968:


Ugh, the deeper I look the more issues I find.  That commit is not the whole 
fix, because siblings can show up in the Place = 0 function clauses too.  I've 
added two commits to my original branch for this ticket:

https://github.com/kocolosk/couchdb/tree/968-duplicate-seq-entries-rebased

In these commits I'm relying on the condition that (length(Ours) =:= 1 or 
length(Insert) =:= 1), which I think is justified because we start with a 
single root in both Ours and Insert, and we only drill down into one of the 
trees.

You might recall that Damien's original code for the merge arranged the 
arguments to merge_at so that the the 3rd argument was always the tree that did 
not need to be drilled into.  That reduced the number of function clauses in 
merge_at, but it had the fatal flaw that, if the disk tree ended up in this 
position, the committed document body for a particular revision would be 
ignored in favor of saving a new copy of the same document body.  This was the 
original root cause of the dupes.

Clearly this is some really subtle stuff.  I might see if I can teach myself 
how to use QuickCheck Mini in time to have it hammer on this algorithm and look 
for other bugs.
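A property-based test of the sort QuickCheck Mini could hammer on might assert, after every merge, an invariant like "no node has two children carrying the same revision key". A hand-rolled sketch of that invariant check (names and tree shape invented; not the actual couch_key_tree representation):

```javascript
// Invariant sketch: in a well-formed revision tree, no node may have
// two children with the same revision key -- duplicate siblings are
// what ultimately surfaced as duplicated IDs in _all_docs.
function noDuplicateSiblings(node) {
  const keys = node.children.map(function (c) { return c.key; });
  if (new Set(keys).size !== keys.length) {
    return false; // two siblings share a key: the tree is corrupt
  }
  return node.children.every(noDuplicateSiblings);
}

const healthy = {
  key: '1-a',
  children: [{ key: '2-b', children: [] }, { key: '2-c', children: [] }]
};
const corrupt = {
  key: '1-a',
  children: [{ key: '2-b', children: [] }, { key: '2-b', children: [] }]
};
```

A generator of random linear revision paths, merged repeatedly into a random starting tree with the invariant checked after each merge, would approximate what a QuickCheck property for this algorithm looks like.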


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.