On Mon, Oct 10, 2011 at 11:03 PM, Chris Stockton
<[email protected]> wrote:
> Hello,
>
> On Mon, Oct 10, 2011 at 5:18 PM, Adam Kocoloski <[email protected]> wrote:
>> On Oct 10, 2011, at 8:02 PM, Chris Stockton wrote:
>>
>>> Hello,
>>>
>>> On Mon, Oct 10, 2011 at 4:19 PM, Filipe David Manana
>>> <[email protected]> wrote:
>>>> On Tue, Oct 11, 2011 at 12:03 AM, Chris Stockton
>>>> <[email protected]> wrote:
>>>> Chris,
>>>>
>>>> That said work is in the '1.2.x' branch (and master).
>>>> CouchDB recently migrated from SVN to Git, see:
>>>> http://couchdb.apache.org/community/code.html
>>>>
>>>
>>> Thank you very much for the response Filipe, do you possibly have any
>>> documentation or more detailed summary on what these changes include
>>> and possible benefits of them? I would love to hear about any tweaking
>>> or replication tips you may have for our growth issues, perhaps you
>>> could answer a basic question if nothing else: Do the changes in this
>>> branch minimize the performance impact of continuous replication on
>>> many databases?
>>>
>>> Regardless I plan on getting a build of that branch and doing some
>>> testing of my own very soon.
>>>
>>> Thank you!
>>>
>>> -Chris
>>
>> I'm pretty sure that even in 1.2.x and master each replication with a remote
>> source still requires one dedicated TCP connection to consume the _changes
>> feed. Replications with a local source have always been able to use a
>> connection pool per host:port combination. That's not to downplay the
>> significance of the rewrite of the replicator in 1.2.x; Filipe put quite a
>> lot of time into it.
>>
>> The link to "those darn errors" just pointed to the mbox browser for
>> September 2011. Do you have a more specific link? Regards,
>>
>> Adam
>
> Well, I will remain optimistic that the rewrite has solved several of
> my issues regardless. I guess the idle TCP connections by themselves
> are not too bad; I think the issue arises when they all start to work
> simultaneously =)
>
> Sorry Adam, here is a better link
> http://mail-archives.apache.org/mod_mbox/couchdb-user/201109.mbox/%3ccalkfbxuugljjy-nh46u0u584l+xdqm3ngspenxsjyrxospe...@mail.gmail.com%3E,
> the actual text was:
>
> ---------------
>
> It seems that randomly I am getting errors about crashes as our
> replicator runs, all this replicator does is make sure that all
> databases on the master server replicate to our failover by checking
> status.
>
> Details:
> - I notice the below error in the logs, anywhere from 0 to 30 at a time.
> - It seems that a database might start replicating okay then stop.
> - These errors [1] are on the failover pulling from master
> - No errors are displayed on the master server
> - The databases inside the URL in the db_not_found portion of the
> error, are always available from curl from the failover machine, which
> makes the error strange, somehow it thinks it can't find the database
> - Master seems healthy at all times, all databases are available, no
> errors in log
>
> [1] --
> [Mon, 12 Sep 2011 18:34:14 GMT] [error] [<0.22466.5305>]
> {error_report,<0.30.0>,
> {<0.22466.5305>,crash_report,
> [[{initial_call,{couch_rep,init,['Argument__1']}},
> {pid,<0.22466.5305>},
> {registered_name,[]},
> {error_info,
> {exit,
> {db_not_found,
> <<"http://user:pass@server:5984/db_10944/">>},
> [{gen_server,init_it,6},
> {proc_lib,init_p_do_apply,3}]}},
> {ancestors,
> [couch_rep_sup,couch_primary_services,
> couch_server_sup,<0.31.0>]},
> {messages,[]},
> {links,[<0.81.0>]},
> {dictionary,[]},
> {trap_exit,true},
> {status,running},
> {heap_size,2584},
> {stack_size,24},
> {reductions,794}],
> []]}}
>
One place I've seen this error pop up when it looks like it shouldn't
is when couch_server gets backed up. If you remsh into one of those
servers, you could try the following at the Erlang shell prompt:

    process_info(whereis(couch_server), message_queue_len).

If that number keeps growing across repeated calls, that could be the issue.
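
To see whether the queue is actually growing, the call above can be repeated on a timer rather than retyped by hand. This is a rough sketch for the remsh shell, not something taken from CouchDB itself: the names QueueLen and Samples, the one-second interval, and the count of five samples are all arbitrary choices for illustration. It returns undefined instead of crashing if couch_server is not registered at the moment of the call.

```erlang
%% Hypothetical helper: mailbox length of a registered process,
%% or 'undefined' if the name is not registered or the process died.
QueueLen = fun(Name) ->
    case whereis(Name) of
        undefined -> undefined;
        Pid ->
            case process_info(Pid, message_queue_len) of
                {message_queue_len, Len} -> Len;
                undefined -> undefined
            end
    end
end.

%% Take five samples, one second apart (interval and count are assumptions).
Samples = [begin timer:sleep(1000), QueueLen(couch_server) end
           || _ <- lists:seq(1, 5)].
```

A list that climbs steadily from sample to sample would point at a backed-up couch_server; values that hover near zero would suggest looking elsewhere.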