On Mon, Jan 27, 2014 at 6:14 AM, Benoit Chesneau <bchesn...@gmail.com> wrote:
> On Mon, Jan 27, 2014 at 3:08 AM, Alexander Shorin <kxe...@gmail.com> wrote:
>
>> On Mon, Jan 27, 2014 at 5:56 AM, Benoit Chesneau <bchesn...@gmail.com>
>> wrote:
>> > On Mon, Jan 27, 2014 at 2:50 AM, Alexander Shorin <kxe...@gmail.com>
>> wrote:
>> >
>> >> On Sun, Jan 26, 2014 at 6:44 PM, Dirkjan Ochtman <dirk...@ochtman.nl>
>> >> wrote:
>> >> > On Wed, Jan 22, 2014 at 9:22 PM, Dirkjan Ochtman <dirk...@ochtman.nl>
>> >> wrote:
>> >> >>>   - Action: kocolosk and/or Kxepal will try to look into and solve
>> >> >>>     the COUCHDB-1986 issue
>> >> >
>> >> > Any progress so far?
>> >>
>> >> I'm very sure that this is related to Erlang itself, since I failed
>> >> to reproduce the issue on FreeBSD 9.1 (SpiderMonkey 1.7.0, Erlang
>> >> R15B02, vbox guest) over a long series of test runs, while it
>> >> always occurs on FreeBSD 10 with Erlang R16B02, which Dave gave me
>> >> for testing. I also ran this test in the same environment against
>> >> older releases to try to locate the breaking commit, but our 1.5 and
>> >> 1.4 releases are affected by the same issue. Again, everything is
>> >> fine on the host with R15.
>> >>
>> >>
>> > Well, see my comments and the one from dch. The cause may be something
>> > other than Erlang. It's most probably something deep in the
>> > couch_replicator code. The latest changes in it make the problem
>> > disappear on my machine, while Andy was still able to reproduce it. So
>> > something is preventing the replicator from timing out correctly.
>>
>> Do you mean the side effect of COUCHDB-1953? I'm not sure that it is
>> related (though it could introduce an accidental "fix", since attachment
>> replication becomes faster), because so far I see that this issue
>> depends strongly on the OS and Erlang version.
>>
> No. See the *latest* comment.
>
> https://issues.apache.org/jira/browse/COUCHDB-1986?focusedCommentId=13882243
>
> What I am saying is that even if the fix is unrelated, it actually
> fixes this error on my Mac. I am pretty sure that this error doesn't
> happen on other systems because they are fast enough. I am actually
> wondering what is preventing it from timing out. Also note that it was
> also harder to reproduce in 1.5, so...

I can't say why, but I can certainly say where:
https://github.com/apache/couchdb/blob/master/src/couch_replicator/src/couch_replicator_httpc.erl#L65
Changing infinity to some reasonable finite value (like 10, 20, or 30
seconds) makes the replicator fail with a timeout error instead of
waiting forever for the response.
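
To illustrate the difference, here is a generic sketch in Python. It is
not the couch_replicator_httpc code, and wait_for_response and inbox are
hypothetical names. With no timeout the wait blocks until a response
arrives, which may be never; with a finite timeout the same situation
becomes an error that the caller can handle.

```python
import queue


def wait_for_response(inbox, timeout=None):
    """Return the next message from inbox (hypothetical helper).

    timeout=None plays the role of Erlang's `infinity`: block until
    something arrives. A number of seconds makes the wait fail with
    TimeoutError instead of hanging.
    """
    try:
        return inbox.get(timeout=timeout)
    except queue.Empty:
        raise TimeoutError("no response within %s seconds" % timeout)


if __name__ == "__main__":
    silent_peer = queue.Queue()  # nothing will ever be put here
    try:
        wait_for_response(silent_peer, timeout=0.1)
    except TimeoutError as err:
        print("failed cleanly:", err)
    # wait_for_response(silent_peer) with no timeout would hang forever.
```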

--
,,,^..^,,,
