well, if I set "batch" to true, all of my load scripts die after a
short amount of time with this error:
/var/lib/gems/1.8/gems/couchrest-0.24/lib/couchrest/monkeypatches.rb:41:in
`rbuf_fill': uninitialized constant Timeout::TimeoutError (NameError)
from /usr/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
from /usr/lib/ruby/1.8/net/protocol.rb:126:in `readline'
from /usr/lib/ruby/1.8/net/http.rb:2020:in `read_status_line'
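(For what it's worth, Ruby 1.8's timeout library defines Timeout::Error
and a top-level TimeoutError, but no Timeout::TimeoutError, which is why
the monkeypatch raises a NameError. One possible workaround, a sketch I
haven't verified against couchrest 0.24, is to define the missing
constant as an alias before the gem's rescue path fires:)

```ruby
require 'timeout'

# Ruby 1.8 provides Timeout::Error, not Timeout::TimeoutError; defining
# the latter as an alias lets code that references Timeout::TimeoutError
# resolve the constant instead of raising NameError.
unless Timeout.const_defined?(:TimeoutError)
  Timeout::TimeoutError = Timeout::Error
end
```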
Regardless, it still seems like there is a bottleneck on the server
end. Did I mention I'm running the 'load' scripts locally? So it's
not network latency that is causing the slowness. Any other ideas?
Thanks.
-Tom
On Mon, May 4, 2009 at 12:19 PM, Zachary Zolton
<[email protected]> wrote:
> Yeah, the optional second argument (for using bulk save semantics)
> defaults to false.
>
> Also, there's an option where you can set how many documents to batch
> save at a time. I don't remember the default, but I've had good luck
> saving with anywhere between 500 and 2000 docs.
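(To make the chunking concrete, here's a minimal sketch of that batching
pattern in plain Ruby; the method name is illustrative, and the flush
block is where you'd call something like db.bulk_save(slice):)

```ruby
# Sketch: hand documents to a flush callback in fixed-size batches.
# With couchrest you'd pass a block like { |slice| db.bulk_save(slice) }.
def save_in_batches(docs, batch_size)
  flushes = 0
  docs.each_slice(batch_size) do |slice|
    yield slice          # e.g. one POST to _bulk_docs per slice
    flushes += 1
  end
  flushes
end
```

(So 2,500 docs with a batch size of 1,000 comes out to three flushes:
1000, 1000, and 500 docs.)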
>
> On Mon, May 4, 2009 at 11:13 AM, Tom Nichols <[email protected]> wrote:
>> Thanks. I'm using save_doc; I just need to pass 'true' as a second argument?
>>
>> I posted the question here because I assumed the performance
>> bottleneck was on the CouchDB end, not my ruby script. Am I wrong? I
>> assumed if I was running 20 "slow" ruby scripts they would peg the
>> CPU. The fact that I'm not seeing that makes me think there is some
>> blocking/synchronization that is making the CouchDB server slow...?
>>
>> Thanks again.
>> -Tom
>>
>> On Mon, May 4, 2009 at 11:58 AM, Zachary Zolton
>> <[email protected]> wrote:
>>> Short answer: use db.save_doc(hash, true) for bulk_docs behavior.
>>>
>>> Also, consider moving this thread to the CouchRest Google Group:
>>> http://groups.google.com/group/couchrest/topics
>>>
>>> Cheers,
>>> zdzolton
>>>
>>> On Mon, May 4, 2009 at 10:40 AM, Tom Nichols <[email protected]> wrote:
>>>> Hi, I have some questions about insert performance.
>>>>
>>>> I have a single CouchDB 0.9.0 node running on small EC2 instance. I
>>>> attached a huge EBS volume to it and mounted it where CouchDB's data
>>>> files are stored. I fired up a number of ruby scripts running inserts,
>>>> and after a weekend I only have about 30GB / 12M rows of data, which
>>>> seems low. 'top' tells me that my CPU is only about 30% utilized.
>>>>
>>>> Any idea what I might be doing wrong? I pretty much just followed
>>>> these instructions:
>>>> http://wiki.apache.org/couchdb/Getting_started_with_Amazon_EC2
>>>>
>>>> My ruby script looks like this:
>>>> #!/usr/bin/env ruby
>>>> #Script to load random data into CouchDB
>>>>
>>>> require 'rubygems'
>>>> require 'couchrest'
>>>>
>>>> db = CouchRest.database! "http://127.0.0.1:5984/#{ARGV[0]}"
>>>> puts "Created database: #{ARGV[0]}"
>>>>
>>>> max = 9999999999999999
>>>> while 1
>>>>   puts 'loading...'
>>>>   for val in 0..max
>>>>     db.save_doc({ :key => val, 'val one' => "val #{val}",
>>>>                   'val2' => "#{ARGV[1]} #{val}" })
>>>>   end
>>>> end
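(Given the bulk-save advice earlier in the thread, the inner loop above
could queue docs instead of POSTing one per document. A sketch, with a
hypothetical doc-building helper mirroring the script's doc shape; the
couchrest call itself is left as a comment since I can't verify it here:)

```ruby
# Hypothetical helper that builds one doc in the same shape as the
# script above.
def build_doc(val, tag)
  { :key => val, 'val one' => "val #{val}", 'val2' => "#{tag} #{val}" }
end

# With couchrest 0.2x you would then queue each doc for a bulk flush,
# per the advice above (not run here):
#   db.save_doc(build_doc(val, ARGV[1]), true)  # flushed via _bulk_docs
```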
>>>>
>>>>
>>>> Thanks in advance...
>>>>
>>>
>>
>