Re: Managing Git identities?

2012-02-28 Thread Jason Smith
On Tue, Feb 28, 2012 at 5:02 AM, Robert Newson rnew...@apache.org wrote:
 For my part, I don't set user.email in my global .gitconfig because
 I've often committed with the wrong address. Leaving it undefined
 gives you a warning when you commit. I then set the right local value
 and run git commit --amend --reset-author. Pretty sure our Apache repo
 insists on apache.org addresses too.

<keanu>
Whoa.
</keanu>

Adopted. Thanks.

-- 
Iris Couch


Re: [VOTE] Apache CouchDB 1.2.0 release, second round

2012-02-28 Thread Filipe David Manana
Jason, made some more tests with larger documents (template is
https://gist.github.com/1930804) and a different map function:

function(doc) {
   emit([doc.type, doc.category], doc.nested.coords);
}
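
To make the emitted rows concrete, here is a minimal sketch that runs the same map logic against one hand-written document. The document shape is inferred from the view rows shown in the results below; `runMap`, `sampleDoc` and the explicitly passed `emit` are illustrative stand-ins for the view server, not CouchDB code.

```javascript
// Stand-in for the view server: collects whatever the map function emits.
function runMap(mapFn, doc) {
  const rows = [];
  // CouchDB supplies emit() as a global; here it is passed in explicitly.
  const emit = (key, value) => rows.push({ id: doc._id, key, value });
  mapFn(doc, emit);
  return rows;
}

// Same logic as the map function above, with emit passed as a parameter.
function map(doc, emit) {
  emit([doc.type, doc.category], doc.nested.coords);
}

// Hypothetical document; field names follow the rows shown in the results.
const sampleDoc = {
  _id: "doc-1",
  type: "dwarf",
  category: "assassin",
  nested: { coords: [{ x: 31227.35, y: 31529.73 }] },
};

console.log(JSON.stringify(runMap(map, sampleDoc)));
```

Each document yields exactly one row whose key is the two-element array [type, category] and whose value is the nested coordinate list.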

(patch http://friendpaste.com/5C99aqXocN6N6H1BAYIigs)

Here's the results I got ( https://gist.github.com/1930807 )


Before COUCHDB-1186

fdmanana 23:21:05 ~/git/hub/slow_couchdb (master) docs=50
batch=5000 ./bench.sh wow.tpl
Server: CouchDB/1.2.0a-a68a792-git (Erlang OTP/R14B03)
{couchdb:Welcome,version:1.2.0a-a68a792-git}

[INFO] Created DB named `db1'
[INFO] Uploaded 5000 documents via _bulk_docs
()
[INFO] Uploaded 5000 documents via _bulk_docs
Building view.
{total_rows:50,offset:0,rows:[
{id:00051ef7-d735-48d7-9ba8-5a21a86e8d57,key:[dwarf,assassin],value:[{x:31227.35,y:31529.73},{x:116667.85,y:82008.25},{x:224.11,y:36652.41},{x:128565.95,y:6780.2},{x:165230.43,y:176208.63}]}
]}

real    5m6.676s
user    0m0.009s
sys     0m0.010s


After COUCHDB-1186

fdmanana 23:50:07 ~/git/hub/slow_couchdb (master) docs=50
batch=5000 ./bench.sh wow.tpl
Server: CouchDB/1.2.0a-f023052-git (Erlang OTP/R14B03)
{couchdb:Welcome,version:1.2.0a-f023052-git}

[INFO] Created DB named `db1'
[INFO] Uploaded 5000 documents via _bulk_docs
()
[INFO] Uploaded 5000 documents via _bulk_docs
Building view.
{total_rows:50,offset:0,rows:[
{id:00051ef7-d735-48d7-9ba8-5a21a86e8d57,key:[dwarf,assassin],value:[{x:31227.35,y:31529.73},{x:116667.85,y:82008.25},{x:224.11,y:36652.41},{x:128565.95,y:6780.2},{x:165230.43,y:176208.63}]}
]}

real    5m1.395s
user    0m0.008s
sys     0m0.010s


After COUCHDB-1186 + better queueing patch (
http://friendpaste.com/178nPFgfyyeGf2vtNRpL0w )

fdmanana 00:14:25 ~/git/hub/slow_couchdb (master) docs=50
batch=5000 ./bench.sh wow.tpl
Server: CouchDB/1.2.0a-f023052-git (Erlang OTP/R14B03)
{couchdb:Welcome,version:1.2.0a-f023052-git}

[INFO] Created DB named `db1'
[INFO] Uploaded 5000 documents via _bulk_docs
()
[INFO] Uploaded 5000 documents via _bulk_docs
Building view.
{total_rows:50,offset:0,rows:[
{id:00051ef7-d735-48d7-9ba8-5a21a86e8d57,key:[dwarf,assassin],value:[{x:31227.35,y:31529.73},{x:116667.85,y:82008.25},{x:224.11,y:36652.41},{x:128565.95,y:6780.2},{x:165230.43,y:176208.63}]}
]}

real    4m48.175s
user    0m0.008s
sys     0m0.009s


Disk model is APPLE SSD TS128C; quad-core machine with 8 GB of RAM.

Unfortunately I don't have access to the machine I used for the tests
in COUCHDB-1186 (spinning disk, Linux) before next week.


On Mon, Feb 27, 2012 at 7:49 PM, Paul Davis paul.joseph.da...@gmail.com wrote:
 On Mon, Feb 27, 2012 at 7:18 PM, Filipe David Manana
 fdman...@apache.org wrote:
 Jason, can't reproduce those results, not even close:

 http://friendpaste.com/1L4pHH8WQchaLIMCWhKX9Z

 Before COUCHDB-1186

 fdmanana 16:58:02 ~/git/hub/slow_couchdb (master) docs=50
 batch=5 ./bench.sh small_doc.tpl
 Server: CouchDB/1.2.0a-a68a792-git (Erlang OTP/R14B03)
 {couchdb:Welcome,version:1.2.0a-a68a792-git}

 [INFO] Created DB named `db1'
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 Building view.
 {total_rows:50,offset:0,rows:[
 {id:doc1,key:1,value:1}
 ]}

 real    0m56.241s
 user    0m0.006s
 sys     0m0.005s


 After COUCHDB-1186

 fdmanana 17:02:02 ~/git/hub/slow_couchdb (master) docs=50
 batch=5 ./bench.sh small_doc.tpl
 Server: CouchDB/1.2.0a-f023052-git (Erlang OTP/R14B03)
 {couchdb:Welcome,version:1.2.0a-f023052-git}

 [INFO] Created DB named `db1'
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via _bulk_docs
 Building view.
 {total_rows:50,offset:0,rows:[
 {id:doc1,key:1,value:1}
 ]}

 real    1m11.694s
 user    0m0.006s
 sys     0m0.005s
 fdmanana 17:06:01 ~/git/hub/slow_couchdb (master)


 1.2.0a-f023052-git with patch
 http://friendpaste.com/178nPFgfyyeGf2vtNRpL0w  applied on top

 fdmanana 17:06:53 ~/git/hub/slow_couchdb (master) docs=50
 batch=5 ./bench.sh small_doc.tpl
 Server: CouchDB/1.2.0a-f023052-git (Erlang OTP/R14B03)
 {couchdb:Welcome,version:1.2.0a-f023052-git}

 [INFO] Created DB named `db1'
 [INFO] Uploaded 5 documents via _bulk_docs
 [INFO] Uploaded 5 documents via 

[jira] [Commented] (COUCHDB-1186) Speedups in the view indexer

2012-02-28 Thread Filipe Manana (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218002#comment-13218002
 ] 

Filipe Manana commented on COUCHDB-1186:


My replies in the following development mailing list thread:

http://mail-archives.apache.org/mod_mbox/couchdb-dev/201202.mbox/%3CCA%2BY%2B4475J_wPbiC%3Dg2R6CcqUfQ-_V6TTTxV2iS4xTbz9a10%2BXw%40mail.gmail.com%3E



 Speedups in the view indexer
 ----------------------------

                 Key: COUCHDB-1186
                 URL: https://issues.apache.org/jira/browse/COUCHDB-1186
             Project: CouchDB
          Issue Type: Improvement
            Reporter: Filipe Manana
            Assignee: Filipe Manana
             Fix For: 1.2


 The patches at [1] and [2] do 2 distinct optimizations to the view indexer
 1) Use a NIF to implement couch_view:less_json/2;
 2) Multiple small optimizations to couch_view_updater - the main one is to 
 decode the view server's JSON only in the updater's write process, avoiding 2 
 EJSON term copying phases (couch_os_process - updater processes and writes 
 work queue)
 [1] - 
 https://github.com/fdmanana/couchdb/commit/3935a4a991abc32132c078e908dbc11925605602
 [2] - 
 https://github.com/fdmanana/couchdb/commit/cce325378723c863f05cca2192ac7bd58eedde1c
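
For readers unfamiliar with what couch_view:less_json/2 decides, the following is a simplified sketch of CouchDB's documented view collation order (null, false, true, numbers, strings, arrays, objects). It is not the NIF from patch [1]: string comparison here uses `localeCompare` as a rough approximation of ICU collation, object comparison is simplified, and the names `typeRank`, `compareJson` and `lessJson` are made up for illustration.

```javascript
// Rank of each JSON type in CouchDB's view collation order:
// null < false < true < numbers < strings < arrays < objects.
function typeRank(v) {
  if (v === null) return 0;
  if (v === false) return 1;
  if (v === true) return 2;
  if (typeof v === "number") return 3;
  if (typeof v === "string") return 4;
  if (Array.isArray(v)) return 5;
  return 6; // plain object
}

// Simplified comparator: negative if a sorts before b, positive if after.
function compareJson(a, b) {
  const ra = typeRank(a);
  const rb = typeRank(b);
  if (ra !== rb) return ra - rb;
  if (ra <= 2) return 0; // both null, both false or both true
  if (ra === 3) return a - b; // numbers
  if (ra === 4) return a.localeCompare(b); // strings (approximation of ICU)
  if (ra === 5) {
    // Arrays: element by element, the shorter array first on a tie.
    const n = Math.min(a.length, b.length);
    for (let i = 0; i < n; i++) {
      const c = compareJson(a[i], b[i]);
      if (c !== 0) return c;
    }
    return a.length - b.length;
  }
  // Objects: compare [key, value] pairs in insertion order, as arrays.
  return compareJson(Object.entries(a), Object.entries(b));
}

function lessJson(a, b) {
  return compareJson(a, b) < 0;
}

const keys = [["dwarf", "assassin", 2, 1.1], "dwarf", 3, null, true, false, {}];
console.log(JSON.stringify(keys.slice().sort(compareJson)));
```

A comparison like this runs for every key inserted into the view btree, which is why moving it into native code can matter for index build time.
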
 Using these 2 patches, I've seen significant improvements to view generation 
 time. Here I present as example the databases at:
 A) http://fdmanana.couchone.com/indexer_test_2
 B) http://fdmanana.couchone.com/indexer_test_3
 ## Trunk
 ### Database A
 $ time curl http://localhost:5985/indexer_test_2/_design/test/_view/view1?limit=1
 {total_rows:1102400,offset:0,rows:[
 {id:00d49881-7bcf-4c3d-a65d-e44435eeb513,key:[dwarf,assassin,2,1.1],value:[{x:174347.18,y:127272.8},{x:35179.93,y:41550.55},{x:157014.38,y:172052.63},{x:116185.83,y:69871.73},{x:153746.28,y:190006.59}]}
 ]}
 real  19m46.007s
 user  0m0.024s
 sys   0m0.020s
 ### Database B
 $ time curl http://localhost:5985/indexer_test_3/_design/test/_view/view1?limit=1
 {total_rows:1102400,offset:0,rows:[
 {id:00d49881-7bcf-4c3d-a65d-e44435eeb513,key:[dwarf,assassin,2,1.1],value:[{x:174347.18,y:127272.8},{x:35179.93,y:41550.55},{x:157014.38,y:172052.63},{x:116185.83,y:69871.73},{x:153746.28,y:190006.59}]}
 ]}
 real  21m41.958s
 user  0m0.004s
 sys   0m0.028s
 ## Trunk + the 2 patches
 ### Database A
   $ time curl http://localhost:5984/indexer_test_2/_design/test/_view/view1?limit=1
   {total_rows:1102400,offset:0,rows:[
   {id:00d49881-7bcf-4c3d-a65d-e44435eeb513,key:[dwarf,assassin,2,1.1],value:[{x:174347.18,y:127272.8},{x:35179.93,y:41550.55},{x:157014.38,y:172052.63},{x:116185.83,y:69871.73},{x:153746.28,y:190006.59}]}
   ]}
   real  16m1.820s
   user  0m0.000s
   sys   0m0.028s
   (versus 19m46s with trunk)
 ### Database B
   $ time curl http://localhost:5984/indexer_test_3/_design/test/_view/view1?limit=1
   {total_rows:1102400,offset:0,rows:[
   {id:00d49881-7bcf-4c3d-a65d-e44435eeb513,key:[dwarf,assassin,2,1.1],value:[{x:174347.18,y:127272.8},{x:35179.93,y:41550.55},{x:157014.38,y:172052.63},{x:116185.83,y:69871.73},{x:153746.28,y:190006.59}]}
   ]}
   real  17m22.778s
   user  0m0.020s
   sys   0m0.016s
   (versus 21m41s with trunk)
 Repeating these tests, always clearing my OS/fs cache before running them 
 (via `echo 3 > /proc/sys/vm/drop_caches`), I always get about the same 
 relative differences.
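
The relative difference can be worked out from the `real` values quoted above. This is a small sketch; the helper names `parseReal` and `percentSaved` are made up for illustration, and only the four timings come from the issue text.

```javascript
// Convert the "real" field printed by `time` (e.g. "19m46.007s") to seconds.
function parseReal(text) {
  const m = /^(\d+)m(\d+(?:\.\d+)?)s$/.exec(text.trim());
  if (!m) throw new Error("unrecognized time: " + text);
  return Number(m[1]) * 60 + Number(m[2]);
}

// Percentage of wall-clock time saved going from `before` to `after`.
function percentSaved(before, after) {
  const b = parseReal(before);
  const a = parseReal(after);
  return ((b - a) / b) * 100;
}

// "real" values copied from the runs above.
const runs = [
  { db: "A", trunk: "19m46.007s", patched: "16m1.820s" },
  { db: "B", trunk: "21m41.958s", patched: "17m22.778s" },
];

for (const r of runs) {
  console.log(`database ${r.db}: ${percentSaved(r.trunk, r.patched).toFixed(1)}% less time`);
}
```

For these two databases the saving works out to roughly 19% and 20% of the trunk build time.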

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: feasibility of a design doc option to use the ddoc new/ddoc id based protocol for map and reduce as well

2012-02-28 Thread Ronny Pfannschmidt

On 02/28/2012 04:09 AM, Jason Smith wrote:

On Tue, Feb 28, 2012 at 7:12 AM, Ronny Pfannschmidt
ronny.pfannschm...@gmx.de  wrote:

Hi,

while trying to build a view server for ddocs that validate/support
documents as FSMs (finite state machines),
I hit an inherent limit of the protocol:

map and reduce don't get the full ddoc, only a part of it,
which means my view server can't actually work with the full ddoc
unless I do some weird hacks and end up in a situation
where I circumvent proper view invalidation.

If map/reduce were allowed to opt in to the newer protocol for
accessing functions,
my problem would go away.

As for view invalidation, a simple variant could just use the _rev;
a more sophisticated one would take a hash of parts of the document
(using excludes/includes defined in options as well).


Hi, Ronny. Are you aware that the contents of .views.lib are sent to
the view server? At least with Javascript, the idea is that CommonJS
modules can go in there.

Maybe that can help as a workaround for now.



Hi Jason,

rather than just a workaround,
I would like to know the likelihood of a patch being accepted that
implements the view option + using _rev as an invalidation hint.


Also, I can't find docs on the protocol that's used for sending the
views' CommonJS modules to the view server.


-- Ronny


Re: feasibility of a design doc option to use the ddoc new/ddoc id based protocol for map and reduce as well

2012-02-28 Thread Alexander Shorin
Hi Ronny,

Invalidating views on every ddoc _rev change is a very bad idea: your
2M-doc database would have to be reindexed on each ddoc update, e.g.
when adding an attachment or changing a show function. Wait, what's the
reason for views to be invalidated in this case?
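
The difference between the two invalidation hints discussed in this thread can be sketched as follows. This is only an illustration of the "hash of parts of the document" variant, assuming a hypothetical `viewSignature` helper and a toy FNV-1a hash in place of a real digest; it is not how CouchDB itself is implemented.

```javascript
// Serialize with sorted object keys so that key order does not change the hash.
function canonical(value) {
  if (Array.isArray(value)) return "[" + value.map(canonical).join(",") + "]";
  if (value !== null && typeof value === "object") {
    return "{" + Object.keys(value).sort()
      .map((k) => JSON.stringify(k) + ":" + canonical(value[k]))
      .join(",") + "}";
  }
  return JSON.stringify(value);
}

// 32-bit FNV-1a over the string; a stand-in for a real digest such as MD5.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Hash only the listed top-level ddoc fields (the "includes").
function viewSignature(ddoc, includes) {
  const picked = {};
  for (const field of includes) {
    if (field in ddoc) picked[field] = ddoc[field];
  }
  return fnv1a(canonical(picked));
}

const before = {
  _id: "_design/app",
  _rev: "1-abc",
  views: { by_type: { map: "function (doc) { emit(doc.type, null); }" } },
  shows: { page: "function (doc, req) { return 'v1'; }" },
};
// Only the show function and _rev change; the views are untouched.
const after = { ...before, _rev: "2-def", shows: { page: "function (doc, req) { return 'v2'; }" } };

console.log(viewSignature(before, ["views"]) === viewSignature(after, ["views"])); // true
console.log(before._rev === after._rev); // false
```

With a _rev-based hint the show-function edit would force a full reindex, while a hash restricted to the view-related fields leaves the index valid.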


--
,,,^..^,,,





Re: [VOTE] Apache CouchDB 1.2.0 release, second round

2012-02-28 Thread Benoit Chesneau
On Tue, Feb 28, 2012 at 4:49 AM, Paul Davis paul.joseph.da...@gmail.com wrote

 Yeah, I've seen the btree behave quite differently on SSD's vs HDD's
 (same code had drastically different runtime characteristics).

 In other words, can we get a report of what type of disk everyone is running 
 on?

+ 1 .

We are actually polluting this vote thread, and the ticket about view
speedups, which may or may not be related :) Maybe we could open a
ticket to collect all the feedback and tests we have?

Also, Noah, Jan: what is the status of this vote? Should we consider it
aborted or paused?

- benoit


Re: feasibility of a design doc option to use the ddoc new/ddoc id based protocol for map and reduce as well

2012-02-28 Thread Jason Smith
On Tue, Feb 28, 2012 at 10:05 AM, Alexander Shorin kxe...@gmail.com wrote:
 Hi Ronny,

 Invalidating views by ddoc _rev change is very bad idea - your 2M docs
 database will have to be reindexed on each ddoc update: by adding
 attachment or changing show function. Wait, what's the reason for
 views to be invalidated in this case?

Ronny, please correct me if I am wrong.

But I think the reason is to allow using the *entire* design document
to help build views. If so, the _rev invalidation is one thing, but
changing CouchDB to send the entire ddoc will be a more substantial
change.

At any rate, this is why some example failing unit tests might clarify
the objective.

-- 
Iris Couch


Re: feasibility of a design doc option to use the ddoc new/ddoc id based protocol for map and reduce as well

2012-02-28 Thread Benoit Chesneau
On Tue, Feb 28, 2012 at 11:09 AM, Jason Smith j...@iriscouch.com wrote:
 On Tue, Feb 28, 2012 at 10:05 AM, Alexander Shorin kxe...@gmail.com wrote:
 Hi Ronny,

 Invalidating views by ddoc _rev change is very bad idea - your 2M docs
 database will have to be reindexed on each ddoc update: by adding
 attachment or changing show function. Wait, what's the reason for
 views to be invalidated in this case?

 Ronny, please correct me if I am wrong.

 But I think the reason is to allow using the *entire* design document
 to help build views. If so, the _rev invalidation is one thing, but
 changing CouchDB to send the entire ddoc will be a more substantial
 change.

 At any rate, this is why some example failing unit tests might clarify
 the objective.


Why not add a version property to your ddoc changes?

- benoît


Re: feasibility of a design doc option to use the ddoc new/ddoc id based protocol for map and reduce as well

2012-02-28 Thread Ronny Pfannschmidt

On 02/28/2012 11:24 AM, Benoit Chesneau wrote:

On Tue, Feb 28, 2012 at 11:09 AM, Jason Smith j...@iriscouch.com wrote:

On Tue, Feb 28, 2012 at 10:05 AM, Alexander Shorin kxe...@gmail.com wrote:

Hi Ronny,

Invalidating views by ddoc _rev change is very bad idea - your 2M docs
database will have to be reindexed on each ddoc update: by adding
attachment or changing show function. Wait, what's the reason for
views to be invalidated in this case?


Ronny, please correct me if I am wrong.

But I think the reason is to allow using the *entire* design document
to help build views. If so, the _rev invalidation is one thing, but
changing CouchDB to send the entire ddoc will be a more substantial
change.

At any rate, this is why some example failing unit tests might clarify
the objective.



why not adding a version property to your ddoc changes ?



I started to realize that a better workaround could be to simply
put the data required for my view server's view handling into the
ddoc's views.lib attribute.


Changes to that would then automatically invalidate the views without
breaking everything.


I will investigate how to lay out my ddocs to get that behavior.

-- Ronny
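
A minimal sketch of that layout, for the JavaScript case Jason mentions: shared data lives in a CommonJS module under views.lib and the map function loads it with require('views/lib/...'). The design document, module name and state table below are invented for illustration, and `runView` is only a local stand-in for the view server so the example can be run outside CouchDB.

```javascript
// Hypothetical design document. Module and view names are made up;
// the sources are stored as strings, as they would be in a ddoc.
const ddoc = {
  _id: "_design/fsm",
  views: {
    lib: {
      states:
        "exports.transitions = { draft: ['review'], review: ['draft', 'published'], published: [] };",
    },
    by_state: {
      map:
        "function (doc) {" +
        "  var fsm = require('views/lib/states');" +
        "  if (fsm.transitions.hasOwnProperty(doc.state)) {" +
        "    emit(doc.state, fsm.transitions[doc.state]);" +
        "  }" +
        "}",
    },
  },
};

// Local stand-in for the view server, for illustration only.
function runView(designDoc, viewName, doc) {
  const rows = [];
  const emit = (key, value) => rows.push({ key, value });
  const requireFromDdoc = (path) => {
    // Resolve "views/lib/<name>" to the module source inside the ddoc.
    const source = path.split("/").reduce((node, part) => node[part], designDoc);
    const mod = { exports: {} };
    new Function("module", "exports", source)(mod, mod.exports);
    return mod.exports;
  };
  const mapFn = new Function(
    "emit",
    "require",
    "return (" + designDoc.views[viewName].map + ");"
  )(emit, requireFromDdoc);
  mapFn(doc);
  return rows;
}

console.log(JSON.stringify(runView(ddoc, "by_state", { _id: "a", state: "review" })));
```

Because everything the map function depends on sits under the views field, editing the state table changes the view definition itself rather than some unrelated part of the ddoc.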



Re: [VOTE] Apache CouchDB 1.2.0 release, second round

2012-02-28 Thread Bob Dionne
Filipe,

This additional patch looks good, though I haven't tested it. Interesting 
comment about R15B, I did notice a difference with BigCouch in terms of some of 
the internal race conditions we see at times. Perhaps there are some 
performance changes relating to that. I also recently upgraded from the Macbook 
pro to a MBA so who knows.

I ran Jason and Bob's scripts a bit last night and saw similar slow downs 
between 1.1 and 1.2, though as reported elsewhere with larger docs it's less of 
an issue. In this patch[1] there's clearly a savings in avoiding the decode 
call, but I wonder how often that case obtains compared to the others. If {cmd, 
CMD} dominates then there is an additional overhead incurred however small it 
might be. Perhaps this explains why the benefits appear for larger docs only.

Anyway, just speculation from the code.

Regards,

Bob

[1] https://github.com/fdmanana/couchdb/commit/cce325378723c863f05cca21

On Feb 27, 2012, at 11:33 AM, Filipe David Manana wrote:

 I just tried Jason's script (modified it to use 500 000 docs instead
 of 50 000) against 1.2.x and 1.1.1, using OTP R14B03. Here's my
 results:
 
 1.2.x:
 
 $ port=5984 ./test.sh
 none
 Filling db.
 done
 HTTP/1.1 200 OK
 Server: CouchDB/1.2.0 (Erlang OTP/R14B03)
 Date: Mon, 27 Feb 2012 16:08:43 GMT
 Content-Type: text/plain; charset=utf-8
 Content-Length: 252
 Cache-Control: must-revalidate
 
 {db_name:db1,doc_count:51,doc_del_count:0,update_seq:51,purge_seq:0,compact_running:false,disk_size:130494577,data_size:130490673,instance_start_time:1330358830830086,disk_format_version:6,committed_update_seq:51}
 Building view.
 
 real  1m5.725s
 user  0m0.006s
 sys   0m0.005s
 done
 
 
 1.1.1:
 
 $ port=5984 ./test.sh
 
 Filling db.
 done
 HTTP/1.1 200 OK
 Server: CouchDB/1.1.2a785d32f-git (Erlang OTP/R14B03)
 Date: Mon, 27 Feb 2012 16:15:33 GMT
 Content-Type: text/plain;charset=utf-8
 Content-Length: 230
 Cache-Control: must-revalidate
 
 {db_name:db1,doc_count:51,doc_del_count:0,update_seq:51,purge_seq:0,compact_running:false,disk_size:122142818,instance_start_time:1330359233327316,disk_format_version:5,committed_update_seq:51}
 Building view.
 
 real  1m4.249s
 user  0m0.006s
 sys   0m0.005s
 done
 
 
 I don't see any significant difference there.
 
 Regarding COUCHDB-1186, the only thing that might cause some
 non-determinism and affect performance is the queueing/dequeueing.
 Depending on timings, it's possible the writer is dequeueing fewer
 items per dequeue operation and therefore inserting smaller batches
 into the btree. The following small change ensures larger batches
 (while still respecting the queue max size/item count):
 
 http://friendpaste.com/178nPFgfyyeGf2vtNRpL0w
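
The effect described here can be illustrated with a toy simulation. It does not reproduce the linked patch: `simulate`, the `minBatch` threshold and the arrival pattern are invented for illustration, and the only point is that a writer which drains the queue as soon as anything arrives performs more, smaller inserts than one which waits for a fuller queue.

```javascript
// Simulate a writer draining a work queue. At tick t, arrivals[t] items are
// enqueued. The writer flushes once at least minBatch items are waiting (or
// when the input ends), never taking more than maxItems in one batch.
function simulate(arrivals, minBatch, maxItems) {
  const batches = [];
  let queued = 0;
  arrivals.forEach((n, t) => {
    queued += n;
    const last = t === arrivals.length - 1;
    while (queued > 0 && (queued >= minBatch || last)) {
      const take = Math.min(queued, maxItems);
      batches.push(take); // one btree insert call covering `take` items
      queued -= take;
    }
  });
  return batches;
}

const arrivals = [3, 2, 4, 1, 5, 3, 2];

// Eager writer: takes whatever is there on every tick.
console.log(JSON.stringify(simulate(arrivals, 1, 10))); // [3,2,4,1,5,3,2]
// Patient writer: waits for at least 8 items, capped at 10 per batch.
console.log(JSON.stringify(simulate(arrivals, 8, 10))); // [9,9,2]
```

Both runs write the same 20 items, but the second does so in 3 inserts instead of 7.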
 
 Running the test with this change:
 
 $ port=5984 ./test.sh
 none
 Filling db.
 done
 HTTP/1.1 200 OK
 Server: CouchDB/1.2.0 (Erlang OTP/R14B03)
 Date: Mon, 27 Feb 2012 16:23:20 GMT
 Content-Type: text/plain; charset=utf-8
 Content-Length: 252
 Cache-Control: must-revalidate
 
 {db_name:db1,doc_count:51,doc_del_count:0,update_seq:51,purge_seq:0,compact_running:false,disk_size:130494577,data_size:130490673,instance_start_time:1330359706846104,disk_format_version:6,committed_update_seq:51}
 Building view.
 
 real  0m49.762s
 user  0m0.006s
 sys   0m0.005s
 done
 
 
 If there's no objection, I'll push that patch.
 
 Also, another note: I noticed some time ago that with master, using OTP
 R15B I got a performance drop of 10% to 15% compared to using master
 with OTP R14B04. Maybe it applies to 1.2.x as well.
 
 
 On Mon, Feb 27, 2012 at 5:33 AM, Robert Newson rnew...@apache.org wrote:
 Bob D, can you give more details on the data set you're testing?
 Number of docs, size/complexity of docs, etc? Basically, enough info
 that I could write a script to automate building an equivalent
 database.
 
 I wrote a quick bash script to make a database and time a view build
 here: http://friendpaste.com/7kBiKJn3uX1KiGJAFPv4nK
 
 B.
 
 On 27 February 2012 13:15, Jan Lehnardt j...@apache.org wrote:
 
 On Feb 27, 2012, at 12:58 , Bob Dionne wrote:
 
 Thanks for the clarification. I hope I'm not conflating things by 
 continuing the discussion here, I thought that's what you requested?
 
 The discussion we had on IRC was regarding collecting more data items for 
 the performance regression before we start to draw conclusions.
 
 My intention here is to understand what needs doing before we can release 
 1.2.0.
 
 I'll reply inline for the other issues.
 
 I just downloaded the release candidate again to start fresh. make 
 distcheck hangs on this step:
 
 /Users/bitdiddle/Downloads/apache-couchdb-1.2.0/apache-couchdb-1.2.0/_build/../test/etap/150-invalid-view-seq.t
  . 6/?
 
 Just stops completely. This is on R15B which has been rebuilt to use the 
 recommended older SSL version. I haven't looked into this crashing too 
 closely but I'm suspicious that I only see it with couchdb and never with 
 bigcouch and never using the 1.2.x 

Re: [VOTE] Apache CouchDB 1.2.0 release, second round

2012-02-28 Thread Benoit Chesneau
On Tue, Feb 28, 2012 at 11:05 AM, Benoit Chesneau bchesn...@gmail.com wrote:
 On Tue, Feb 28, 2012 at 4:49 AM, Paul Davis paul.joseph.da...@gmail.com 
 wrote

 Yeah, I've seen the btree behave quite differently on SSD's vs HDD's
 (same code had drastically different runtime characteristics).

 In other words, can we get a report of what type of disk everyone is running 
 on?

 + 1 .

 We actually pollute this thread about vote, and the ticket about view
 speedups which could be related or not :) Maybe we could open a ticket
 to collect all the feedback and tests we have ?

N / Y ?


Re: [VOTE] Apache CouchDB 1.2.0 release, second round

2012-02-28 Thread Robert Newson
I'm running my script on a EC2 node with spinning media, the numbers
come out the same for 1.1.1 vs 1.2. The only time I've seen a slowdown
with a scripted approach is my original one which didn't use bulk
docs. :/

B.


Please report your indexing speed

2012-02-28 Thread Jason Smith
Forgive the clean new thread. Hopefully it will not remain so.

If you can, would you please clone https://github.com/jhs/slow_couchdb

And build whatever Erlangs and CouchDB checkouts you see fit, and run
the test. For example:

docs=50 ./bench.sh small_doc.tpl

That should run the test and, God willing, upload the results to a
couch in the cloud. We should be able to use that information to
identify who you are, whether you are on SSD, what Erlang and Couch
build, and how fast it ran. Modulo bugs.


Re: [VOTE] Apache CouchDB 1.2.0 release, second round

2012-02-28 Thread Noah Slater
On Tue, Feb 28, 2012 at 10:05 AM, Benoit Chesneau bchesn...@gmail.com wrote:

 Also noah, jan what is the status of this vote? Should we consider it
 as aborted or paused?


As far as I can tell, we have not identified, for sure, a release blocking
issue. Once we are sure that there is a release blocking issue, I will
abort the vote. But I am not aborting it simply because it's taking a while
to get clear on the issues. :)


[jira] [Commented] (COUCHDB-1275) Futon's recent database list doesn't decode slashes in database names

2012-02-28 Thread Jan Lehnardt (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218241#comment-13218241
 ] 

Jan Lehnardt commented on COUCHDB-1275:
---

yeah :)

 Futon's recent database list doesn't decode slashes in database names
 ---------------------------------------------------------------------

                 Key: COUCHDB-1275
                 URL: https://issues.apache.org/jira/browse/COUCHDB-1275
             Project: CouchDB
          Issue Type: Bug
          Components: Futon
    Affects Versions: 1.1
            Reporter: Jan Lehnardt
            Priority: Minor

 Create a database with a slash in its name; Futon will go to the database
 view automatically and add it to the recent databases list. The list will
 display the encoded %2f instead of the /.
 Here's a quick fix: http://friendpaste.com/1WORPAfSY5MUyoisaAQtZB
 I tested it for XSS, but I may have overlooked something and I'd appreciate a
 review.





Re: Please report your indexing speed

2012-02-28 Thread Filipe David Manana
Jason, repeated my last test with the 1Kb docs (
https://gist.github.com/1930804, map function
http://friendpaste.com/5C99aqXocN6N6H1BAYIigs ) to cover branch 1.1.x
as well. Here are the full results (also in
https://gist.github.com/1930807):


Before COUCHDB-1186

fdmanana 23:21:05 ~/git/hub/slow_couchdb (master) docs=50
batch=5000 ./bench.sh wow.tpl
Server: CouchDB/1.2.0a-a68a792-git (Erlang OTP/R14B03)
{couchdb:Welcome,version:1.2.0a-a68a792-git}

[INFO] Created DB named `db1'
[INFO] Uploaded 5000 documents via _bulk_docs
()
[INFO] Uploaded 5000 documents via _bulk_docs
Building view.
{total_rows:50,offset:0,rows:[
{id:00051ef7-d735-48d7-9ba8-5a21a86e8d57,key:[dwarf,assassin],value:[{x:31227.35,y:31529.73},{x:116667.85,y:82008.25},{x:224.11,y:36652.41},{x:128565.95,y:6780.2},{x:165230.43,y:176208.63}]}
]}

real    5m6.676s
user    0m0.009s
sys     0m0.010s


After COUCHDB-1186

fdmanana 23:50:07 ~/git/hub/slow_couchdb (master) docs=50
batch=5000 ./bench.sh wow.tpl
Server: CouchDB/1.2.0a-f023052-git (Erlang OTP/R14B03)
{couchdb:Welcome,version:1.2.0a-f023052-git}

[INFO] Created DB named `db1'
[INFO] Uploaded 5000 documents via _bulk_docs
()
[INFO] Uploaded 5000 documents via _bulk_docs
Building view.
{total_rows:50,offset:0,rows:[
{id:00051ef7-d735-48d7-9ba8-5a21a86e8d57,key:[dwarf,assassin],value:[{x:31227.35,y:31529.73},{x:116667.85,y:82008.25},{x:224.11,y:36652.41},{x:128565.95,y:6780.2},{x:165230.43,y:176208.63}]}
]}

real    5m1.395s
user    0m0.008s
sys     0m0.010s


After COUCHDB-1186 + better queueing patch
(http://friendpaste.com/178nPFgfyyeGf2vtNRpL0w)

fdmanana 00:14:25 ~/git/hub/slow_couchdb (master) docs=50
batch=5000 ./bench.sh wow.tpl
Server: CouchDB/1.2.0a-f023052-git (Erlang OTP/R14B03)
{couchdb:Welcome,version:1.2.0a-f023052-git}

[INFO] Created DB named `db1'
[INFO] Uploaded 5000 documents via _bulk_docs
()
[INFO] Uploaded 5000 documents via _bulk_docs
Building view.
{total_rows:50,offset:0,rows:[
{id:00051ef7-d735-48d7-9ba8-5a21a86e8d57,key:[dwarf,assassin],value:[{x:31227.35,y:31529.73},{x:116667.85,y:82008.25},{x:224.11,y:36652.41},{x:128565.95,y:6780.2},{x:165230.43,y:176208.63}]}
]}

real    4m48.175s
user    0m0.008s
sys     0m0.009s


CouchDB branch 1.1.x

fdmanana 08:16:58 ~/git/hub/slow_couchdb (master) docs=50
batch=5000 ./bench.sh wow.tpl
Server: CouchDB/1.1.2a785d32f-git (Erlang OTP/R14B03)
{couchdb:Welcome,version:1.1.2a785d32f-git}

[INFO] Created DB named `db1'
[INFO] Uploaded 5000 documents via _bulk_docs
()
[INFO] Uploaded 5000 documents via _bulk_docs
Building view.
{total_rows:50,offset:0,rows:[
{id:0001c0a1-edcb-4dbc-aa9d-533c73d980cb,key:[dwarf,assassin],value:[{x:62038.32,y:105825.29},{x:90713.13,y:128570.97},{x:43836.37,y:80517.12},{x:71610.97,y:143739.99},{x:86038.39,y:84731.8}]}
]}

real    5m44.374s
user    0m0.008s
sys     0m0.010s


Disk model APPLE SSD TS128C, quad-core machine, 8 GB of RAM.



On Tue, Feb 28, 2012 at 5:17 AM, Jason Smith j...@apache.org wrote:
 Forgive the clean new thread. Hopefully it will not remain so.

 If you can, would you please clone https://github.com/jhs/slow_couchdb

 And build whatever Erlangs and CouchDB checkouts you see fit, and run
 the test. For example:

    docs=50 ./bench.sh small_doc.tpl

 That should run the test and, God willing, upload the results to a
 couch in the cloud. We should be able to use that information to
 identify who you are, whether you are on SSD, what Erlang and Couch
 build, and how fast it ran. Modulo bugs.



-- 
Filipe David Manana,

Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men.


Re: Please report your indexing speed

2012-02-28 Thread Jan Lehnardt

# tl;dr:

bench_R14B04_1.1.1_default_doc.tpl.log 0m18.749s
bench_R14B04_1.2.x_default_doc.tpl.log 0m16.304s
bench_R15B_1.1.1_default_doc.tpl.log   0m12.946s
bench_R15B_1.2.x_default_doc.tpl.log   0m13.616s

bench_R14B04_1.1.1_nested_6k.tpl.log   1m27.267s
bench_R14B04_1.2.x_nested_6k.tpl.log   0m37.910s
bench_R15B_1.1.1_nested_6k.tpl.log     0m46.963s
bench_R15B_1.2.x_nested_6k.tpl.log     0m33.011s

bench_R14B04_1.1.1_small_doc.tpl.log   1m17.212s
bench_R14B04_1.2.x_small_doc.tpl.log   1m41.383s
bench_R15B_1.1.1_small_doc.tpl.log     0m52.858s
bench_R15B_1.2.x_small_doc.tpl.log     1m9.043s

bench_R14B04_1.1.1_wow.tpl.log         0m29.842s
bench_R14B04_1.2.x_wow.tpl.log         0m24.178s
bench_R15B_1.1.1_wow.tpl.log           0m20.493s
bench_R15B_1.2.x_wow.tpl.log           0m19.584s

(Full logs at [5])
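
To compare two of the timings above, a small shell sketch may help. It
is not part of slow_couchdb; the helper names to_seconds and ratio are
made up for illustration, and it assumes the `time`-style format
(NmS.SSSs) used in the list above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of slow_couchdb.

# Convert a `time`-style value such as 1m27.267s into seconds.
to_seconds() {
  echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Ratio of two timings; a value above 1 means the second run was faster.
ratio() {
  awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
    'BEGIN { printf "%.2f\n", a / b }'
}

# nested_6k on R14B04: 1.1.1 vs 1.2.x
ratio 1m27.267s 0m37.910s
# small_doc on R14B04: 1.1.1 vs 1.2.x
ratio 1m17.212s 1m41.383s
```

With the R14B04 numbers above this prints 2.30 for nested_6k (1.2.x is
faster) and 0.76 for small_doc (1.2.x is slower).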


# Description

All of these are on Mac OS X 10.7.3 on an SSD.

I'll be running the same set on a spinning disk next. Robert N also asked
me to populate the DBs without using bulk docs. Since that's gonna take
a while, I'll probably run this overnight.

All of the results are generated by my fork of Jason's slow_couchdb[1]
and Filipe's seatoncouch[2].

The changes I've made: the small_doc test now runs with 500k instead
of 50k docs, and I added .view files to match the tpl files in
seatoncouch/templates/* so we can have similar views use the different
doc structures.

I also added two scripts to orchestrate the above testing in a more
automated fashion. They also allow you to run the full matrix yourself.
All you need is a homebrew setup that allows `brew switch erlang R14B04` and
`R15B` (which is controlled in matrix.sh[3]) and a git checkout of the
CouchDB sources that allows you to do `git checkout 1.1.1` or `1.2.x`
(which is controlled in runner.sh[4]; adjust the path to the git checkout
there as well).

matrix.sh also allows you to specify which docs to run.
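
As a rough illustration of what such a matrix covers, here is a sketch
that only prints the planned steps. It is not the content of matrix.sh
or runner.sh; the variable names, the function plan_matrix, and the
redirect into a log file are assumptions. The Erlang and CouchDB
versions, the template names and the log file names are taken from the
results above.

```shell
#!/bin/sh
# Hypothetical sketch; this is not the content of matrix.sh or runner.sh.
ERLANGS="R14B04 R15B"
COUCHES="1.1.1 1.2.x"
TEMPLATES="default_doc nested_6k small_doc wow"

# Print the planned steps for every Erlang/CouchDB/template combination.
# Nothing is executed; the commands are only echoed.
plan_matrix() {
  for erl in $ERLANGS; do
    echo "brew switch erlang $erl"
    for couch in $COUCHES; do
      echo "git checkout $couch"
      for tpl in $TEMPLATES; do
        # Assumption: each run is captured in a log named after the combination.
        echo "./bench.sh ${tpl}.tpl > bench_${erl}_${couch}_${tpl}.tpl.log"
      done
    done
  done
}

plan_matrix
```

Two Erlangs, two CouchDB versions and four templates give the 16
benchmark runs listed in the tl;dr.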

Please shout if you need any more info about this test run or how to
run this yourself.


# Analysis

Inconclusive. I'd like to run this on larger dbs in general to see if
more differences shake out, and I have yet to run this on a spinning
disk, let alone on another OS* or with more complex view functions
or larger design docs (like the one Stefan had).

* It shouldn't be too much work to port slow_couchdb to other OSs, I'll
definitely be looking into that, but we can do with every bit of help :)

So far, I'm happy to conclude that while there are definitely provable
differences, we can live with them.

Cheers
Jan
-- 


[1]: https://github.com/janl/slow_couchdb
[2]: https://github.com/janl/seatoncouch
[3]: https://github.com/janl/slow_couchdb/blob/master/matrix.sh
[4]: https://github.com/janl/slow_couchdb/blob/master/runner.sh
[5]: http://jan.prima.de/slow_couch/ssd/



Re: Please report your indexing speed

2012-02-28 Thread Jan Lehnardt
Same story, but spinning disk, 5400rpm:

bench_R14B04_1.1.1_default_doc.tpl.log 0m19.175s
bench_R14B04_1.2.x_default_doc.tpl.log 0m16.821s
bench_R15B_1.1.1_default_doc.tpl.log   0m13.050s
bench_R15B_1.2.x_default_doc.tpl.log   0m13.292s

bench_R14B04_1.1.1_nested_6k.tpl.log   1m26.941s
bench_R14B04_1.2.x_nested_6k.tpl.log   0m39.178s
bench_R15B_1.1.1_nested_6k.tpl.log     0m47.766s
bench_R15B_1.2.x_nested_6k.tpl.log     0m31.697s

bench_R14B04_1.1.1_small_doc.tpl.log   1m19.851s
bench_R14B04_1.2.x_small_doc.tpl.log   1m43.057s
bench_R15B_1.1.1_small_doc.tpl.log     0m52.249s
bench_R15B_1.2.x_small_doc.tpl.log     1m8.195s

bench_R14B04_1.1.1_wow.tpl.log         0m29.589s
bench_R14B04_1.2.x_wow.tpl.log         0m24.867s
bench_R15B_1.1.1_wow.tpl.log           0m20.171s
bench_R15B_1.2.x_wow.tpl.log           0m18.800s

Full logs at http://jan.prima.de/slow_couch/rust/

Cheers
Jan
-- 



Re: Please report your indexing speed

2012-02-28 Thread Jan Lehnardt
For Robert Newson, avoiding bulk inserts to populate the dbs:

bench_R14B04_1.1.1_default_doc.tpl.log 0m19.692s
bench_R14B04_1.2.x_default_doc.tpl.log 0m17.033s

bench_R14B04_1.1.1_nested_6k.tpl.log   1m31.393s
bench_R14B04_1.2.x_nested_6k.tpl.log   0m42.010s

bench_R14B04_1.1.1_small_doc.tpl.log   0m8.103s
bench_R14B04_1.2.x_small_doc.tpl.log   0m10.597s

bench_R14B04_1.1.1_wow.tpl.log         0m33.944s
bench_R14B04_1.2.x_wow.tpl.log         0m27.087s

(Just R14B04, full logs available on demand)

Cheers
Jan
-- 



Re: Please report your indexing speed

2012-02-28 Thread Jan Lehnardt
One more report.

I got suspicious of the rather short runtimes, so I picked the
default_doc and ran it at 500k:

bench_R14B04_1.1.1_default_doc.tpl.log 2m19.139s
bench_R14B04_1.2.x_default_doc.tpl.log 2m18.875s

It seems to me that we need more variation in what we test:
more OSs and larger ddocs, like the one Stefan linked to. Can
anyone help provide this?

Cheers
Jan
-- 



Advice on policy merging non-committer branches

2012-02-28 Thread Jason Smith
I would like to merge a branch from a non-committer[1]. The log shows
a non-apache author, but an apache committer.

What is the policy regarding this? I was thinking the following:

1. Merge freely and promiscuously from anybody in my GitHub (or
whatever) repo (community engagement)
2. As the branch nears time for promotion, ask the non-committer to
git format-patch and attach to JIRA, signing (checking) the license
transfer.
3. With that settled, either git rebase or `git am` (I'm unclear about
this). The point is, get an @apache.org committer id on each commit.
4. Push where appropriate into the ASF repo
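
Steps 2 and 3 can be tried out in a throwaway directory. The following
is only a sketch of the `git am` variant, not a statement of ASF
policy; the repository names, user names and email addresses are all
made up. It shows that `git am` keeps the author recorded in the patch
and records whoever applies it as the committer.

```shell
#!/bin/sh
set -e
# Throwaway demo; every name and address below is made up.
work=$(mktemp -d)
cd "$work"

# "upstream" stands in for the committer's checkout.
git init -q upstream
(
  cd upstream
  export GIT_AUTHOR_NAME="Some Committer" GIT_AUTHOR_EMAIL="committer@apache.org"
  export GIT_COMMITTER_NAME="Some Committer" GIT_COMMITTER_EMAIL="committer@apache.org"
  echo "base" > file.txt
  git add file.txt
  git commit -q -m "Base commit"
)

# The contributor clones, commits, and exports the commit as a patch file
# (this is the file that would be attached to the JIRA ticket).
git clone -q upstream contributor
(
  cd contributor
  export GIT_AUTHOR_NAME="Some Contributor" GIT_AUTHOR_EMAIL="contributor@example.com"
  export GIT_COMMITTER_NAME="Some Contributor" GIT_COMMITTER_EMAIL="contributor@example.com"
  echo "fix" >> file.txt
  git commit -q -a -m "Fix something"
  git format-patch -1 -o "$work/patches" HEAD > /dev/null
)

# The committer applies the patch; author stays, committer changes.
ids=$(
  cd upstream
  export GIT_COMMITTER_NAME="Some Committer" GIT_COMMITTER_EMAIL="committer@apache.org"
  git am -q "$work"/patches/0001-*.patch > /dev/null
  git log -1 --format='author=%ae committer=%ce'
)
echo "$ids"
```

The last line prints the author and committer addresses of the applied
commit. Because the committer header differs from the contributor's
original commit, the applied commit gets a new commit id even though
the diff is the same.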

Questions:

Must the non-committer attach the exact same commit id? Or is it
sufficient that it merely be the same diff (delta)? (I changed the ID
when I rebased his commit and added my email to the committer header.)

Before the JIRA license agreement, may we push non-committers' code to
the repo at all?

Before the JIRA license agreement, may we push non-committers' code to
the more official branches: master, 1.2.x, etc.?

May we push whatever we want so long as the license agreement is
signed (checked) before voting on a release artifact?

[1]: https://github.com/jhs/couchdb/commit/1451ee57f2afdade5b24c3fb4ae37efadf9ef1ed


[jira] [Commented] (COUCHDB-1416) the requested_path that is passed to a show is wrong on a vhost with a path

2012-02-28 Thread Jason Smith (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/COUCHDB-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218890#comment-13218890 ]

Jason Smith commented on COUCHDB-1416:
--

@Jan, would you kindly re-open this ticket (I cannot) so that Ryan can add an 
attachment? Thanks.

@Ryan, would you please run this:

git format-patch 47c81f4c25f5f9ec4ef60c4ea638d77118b9a9ee -1

And attach the 0001-Testing-*.patch file to this ticket and click the copyright 
agreement.

 the requested_path that is passed to a show is wrong on a vhost with a path 
 

              Key: COUCHDB-1416
              URL: https://issues.apache.org/jira/browse/COUCHDB-1416
          Project: CouchDB
       Issue Type: Bug
       Components: HTTP Interface
 Affects Versions: 1.2
         Reporter: Ryan Ramage
         Priority: Minor
      Attachments: A_0001-Testing-requested_path-for-various-combinations-of-r.patch,
                   A_0002-Compatibility-with-the-CLI-test-runner.patch,
                   A_0003-Store-the-entire-requested-path-in-x-couchdb-vhost-f.patch,
                   A_0004-For-a-vhost-correctly-reflect-true-requested-path.patch


 In a show or list, it is impossible to construct a full url that an end user 
 could use to re-request the resource, given the various combinations of 
 vhosts and rewrites. 
 The major one is if the vhost contains a path component, this path 
 information is not passed to the show at all. 
 I have created three tests that highlight the condition, currently failing 
 for one test, with the two passing to prevent regressions.
 The commit can be found here:
 https://github.com/ryanramage/couchdb/commit/e9417480e2ce160f359d9508dcec3d4e56045a60
 I have talked this over with JasonSmith and bennoitc on #couchdb and they 
 asked me to write the tests and raise the jira. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Reopened] (COUCHDB-1416) the requested_path that is passed to a show is wrong on a vhost with a path

2012-02-28 Thread Adam Kocoloski (Reopened) (JIRA)

[ https://issues.apache.org/jira/browse/COUCHDB-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Kocoloski reopened COUCHDB-1416:
-


+JasonSmith: Actually, would anybody who is able please reopen COUCHDB-1416 so 
that Ryan can add an attachment. Thanks!
