Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-15 Thread Branko Čibej
On 09.01.2014 20:19, Branko Čibej wrote:
> On 09.01.2014 17:09, Mojca Miklavec wrote:
>> I'm unable to reproduce the faulty behaviour if I do a checkout from
>> the same network where the server is located, no matter what I try
>> (upgrading SVN client doesn't "help" triggering the error). Philip
>> also said that he had no problem doing a checkout with client version
>> 1.8.5 or 1.7.
>
> This confirms my suspicion that the error is triggered by some part of
> the network infrastructure between your server and the outside world.
> That's why I asked if there is a load-balancer involved. It could also
> be caused by some kind of transparent proxy, or even a packet
> analyzer. I doubt that your server is open to the world without some
> kind of security measures in place.
>
> To be clear, I'm not saying that any of these things are configured
> incorrectly; only that they may be interacting with Subversion in a
> way that we don't handle well. One of the major differences between
> 1.7 (which works) and 1.8 (which fails) is that we try to work around
> issues with non-standard behaviour of certain "transparent" (sic)
> proxies; and we can't claim to have covered all the possibilities.
>
> I can't see a way to figure out what's going on without help from your
> network admins; we need some insight into why the connection is being
> reset on the server side, and analyzing the TCP stream on the client
> can't tell us that.
>
>
> BTW, if you think it'd help to try a live debugging session, I'm only
> about an hour's drive away from IJS.

So to wrap this up: they managed to fix the problem themselves, and it
was indeed "some part of the network infrastructure;" the specifics are
as follows:

They have a Cisco ASA 5580 running IOS 9.1(4), and they had HTTP
protocol inspection turned on; the configuration was as follows:

policy-map type inspect http HTTP-CONTROL
 parameters
  protocol-violation action log
policy-map global_policy
 class inspection_default
  inspect http HTTP-CONTROL


The ASA was closing the connections, and their logs contained one of the
following two reasons:

%ASA-4-415016: policy-map HTTP-CONTROL:Maximum number of unanswered HTTP requests exceeded - Resetting connection from Ext:x.x.x.x/59769 to Int:y.y.y.y/80
%ASA-4-507003: tcp flow from Ext:x.x.x.x/59769 to Int:y.y.y.y/80 terminated by inspection engine, reason - reset unconditionally.
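
For anyone who needs to look for the same two messages in their own firewall logs, here is a minimal sketch of a parser for lines of this shape. It only assumes the "%ASA-<severity>-<id>: text" layout and the "Iface:address/port" endpoints visible in the excerpts above; the function and field names are made up for illustration and are not part of any Cisco tooling.

```python
import re

# Matches the syslog prefix "%ASA-<severity>-<message id>: <text>".
ASA_LINE = re.compile(r"%ASA-(?P<severity>\d)-(?P<msg_id>\d{6}): (?P<text>.*)")
# Matches an interface:address/port endpoint such as "Ext:x.x.x.x/59769".
ENDPOINT = re.compile(r"(?P<iface>\w+):(?P<addr>[\w.]+)/(?P<port>\d+)")


def parse_asa_line(line):
    """Return severity, message id and endpoints of one ASA log line, or None."""
    # Log lines pasted into mail are often hard-wrapped; join them back first.
    joined = " ".join(line.split())
    match = ASA_LINE.search(joined)
    if match is None:
        return None
    endpoints = [
        (m["iface"], m["addr"], int(m["port"]))
        for m in ENDPOINT.finditer(match["text"])
    ]
    return {
        "severity": int(match["severity"]),
        "msg_id": match["msg_id"],
        "endpoints": endpoints,
    }
```

Filtering on msg_id "415016" or "507003" would then pick out the resets described in this message.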


The only reasonable explanation we could come up with was that
moderately low bandwidth and high latency between client and server,
coupled with the fact that some of the files in the repository are
rather large and take a while to transfer, caused the 1.8 client to
queue up enough pipelined GET requests during checkout that the firewall
decided to call it quits. A 1.7 client (using serf) did not exhibit this
problem, because it also sends PROPFINDs, and this apparently changed the
timing enough that the number of pipelined requests never exceeded the
ASA's configured maximum.
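
To make the "unanswered requests" idea concrete, here is a toy model of such a counter. It is only a sketch: the real accounting inside the ASA is not described anywhere in this thread, and the limit of 10 used in the usage note is a hypothetical value, not the ASA's configured maximum.

```python
def first_reset(events, max_unanswered):
    """Return the index of the event at which the toy inspector resets, or None.

    events is a sequence of "req" (client request seen) and "resp"
    (server response completed), in the order they cross the firewall.
    """
    unanswered = 0
    for index, event in enumerate(events):
        if event == "req":
            unanswered += 1
        elif event == "resp":
            unanswered = max(0, unanswered - 1)
        else:
            raise ValueError(f"unknown event: {event!r}")
        # The toy rule: reset as soon as too many requests are outstanding.
        if unanswered > max_unanswered:
            return index
    return None
```

With 13 requests sent back to back and a hypothetical limit of 10, first_reset(["req"] * 13, 10) reports a reset at the 11th request, while a strictly alternating request/response sequence never trips the counter. The point is only that the outcome depends on how requests and completed responses interleave.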

Apparently this is not a new problem, having been reported before:
https://supportforums.cisco.com/thread/2088590

They fixed the issue by switching off HTTP protocol inspection on the
ASA. Interestingly enough, this also fixed a number of intermittent
issues with plain ol' Web browsing that they had on occasion, so this is
not specific to Subversion (as the link above also suggests), but is
rather a bug^Wserious limitation of the ASA and/or IOS.

-- Brane


-- 
Branko Čibej | Director of Subversion
WANdisco // Non-Stop Data
e. br...@wandisco.com


Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-09 Thread Philip Martin
Ben Reser  writes:

> Actually we know we haven't covered all possibilities.  Had someone a while
> back that had mod_security setup in such a way that it was rejecting some
> request methods (think it was POST) without Content-Length (thus breaking
> chunked requests).  The behavior didn't fail for the OPTIONS requests so our
> probe to try and work around transparent proxies failed.
>
> But I'm not sure what this thread would really have to do with chunked
> requests, since the problem seems to be pipelining which as far as I know we
> don't have any workarounds for.
>
> We can rule out chunked requests by disabling them: add this to the
> command line --config-option servers:global:http-chunked-requests=no and see
> if it changes anything.  But I really doubt it based on what I've seen on this
> thread.

Disabling chunked requests makes no difference.  I see a trunk client
failing most of the time, but occasionally it succeeds.  When it fails
it usually hangs and eventually times out, but occasionally it fails
with "Connection reset by peer".  1.8 fails like trunk, but 1.7/serf
works.

If we believe Wireshark then the server is sending an RST part way
through the response to the first of 13 pipelined GETs.  The response
is not chunked, it has Content-Length: 8407648, but the client
only receives 14480 bytes (I think that includes the headers).
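
The mismatch between the advertised Content-Length and the bytes actually received can be checked mechanically on a captured response. The following is a minimal sketch that assumes a non-chunked HTTP/1.1 response held in memory as raw bytes; the function name is hypothetical.

```python
def missing_body_bytes(raw_response):
    """Return how many body bytes are still missing, or None if unknown."""
    head, separator, body = raw_response.partition(b"\r\n\r\n")
    if not separator:
        return None  # the header block itself is incomplete
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            try:
                expected = int(value.strip().decode("ascii"))
            except ValueError:
                return None  # malformed Content-Length value
            return max(0, expected - len(body))
    return None  # no Content-Length header (e.g. a chunked response)
```

A large positive result on a connection that has already been reset is the pattern described above: the response was cut off long before the promised length arrived.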

1.7/serf, which works, pipelines 13 PROPFINDs as well as 13 GETs.  If I
force trunk to pipeline the PROPFINDs using:

Index: ../src/subversion/libsvn_ra_serf/update.c
===
--- ../src/subversion/libsvn_ra_serf/update.c   (revision 1557003)
+++ ../src/subversion/libsvn_ra_serf/update.c   (working copy)
@@ -1630,7 +1630,7 @@
 
   val = svn_xml_get_attr_value("inline-props", attrs);
   if (val && (strcmp(val, "true") == 0))
-ctx->add_props_included = TRUE;
+ctx->add_props_included = FALSE;
 
   val = svn_xml_get_attr_value("send-all", attrs);
   if (val && (strcmp(val, "true") == 0))
@@ -1638,7 +1638,7 @@
   ctx->send_all_mode = TRUE;
 
   /* All properties are included in send-all mode. */
-  ctx->add_props_included = TRUE;
+  ctx->add_props_included = FALSE;
 }
 }
   else if (state == NONE && strcmp(name.name, "target-revision") == 0)

then the checkout with trunk starts working reliably.  Wireshark no
longer shows an RST from the server, it does however show some packets
marked "TCP Previous segment not captured" and some marked "TCP Dup
ACK".

-- 
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*


Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-09 Thread Ben Reser
On 1/9/14, 11:19 AM, Branko Čibej wrote:
> To be clear, I'm not saying that any of these things are configured
> incorrectly; only that they may be interacting with Subversion in a way that
> we don't handle well. One of the major differences between 1.7 (which works) and
> 1.8 (which fails) is that we try to work around issues with non-standard
> behaviour of certain "transparent" (sic) proxies; and we can't claim to have
> covered all the possibilities.

Actually we know we haven't covered all possibilities.  Had someone a while
back that had mod_security setup in such a way that it was rejecting some
request methods (think it was POST) without Content-Length (thus breaking
chunked requests).  The behavior didn't fail for the OPTIONS requests so our
probe to try and work around transparent proxies failed.

But I'm not sure what this thread would really have to do with chunked
requests, since the problem seems to be pipelining which as far as I know we
don't have any workarounds for.

We can rule out chunked requests by disabling them: add this to the
command line --config-option servers:global:http-chunked-requests=no and see
if it changes anything.  But I really doubt it based on what I've seen on this
thread.

More details on what Branko is talking about and the config option I mentioned
here:
https://subversion.apache.org/docs/release-notes/1.8.html#411-length-required
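
For scripted test runs, the option can be put on the command line programmatically. This is only a sketch: the helper name is made up, the URL in the usage note is the placeholder used elsewhere in this thread, and nothing here actually runs svn.

```python
def checkout_command(url, target, chunked=False):
    """Build an 'svn checkout' argument list with chunked requests on or off."""
    value = "yes" if chunked else "no"
    return [
        "svn",
        "checkout",
        "--config-option",
        f"servers:global:http-chunked-requests={value}",
        url,
        target,
    ]
```

The resulting list can be handed to subprocess.run, for example checkout_command("http://svn.myserver.net/myrepo/dirA", "dirA"), and the run repeated several times, since the failure is intermittent.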



Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-09 Thread Branko Čibej
On 09.01.2014 17:09, Mojca Miklavec wrote:
> I'm unable to reproduce the faulty behaviour if I do a checkout from
> the same network where the server is located, no matter what I try
> (upgrading SVN client doesn't "help" triggering the error). Philip
> also said that he had no problem doing a checkout with client version
> 1.8.5 or 1.7.

This confirms my suspicion that the error is triggered by some part of
the network infrastructure between your server and the outside world.
That's why I asked if there is a load-balancer involved. It could also
be caused by some kind of transparent proxy, or even a packet analyzer.
I doubt that your server is open to the world without some kind of
security measures in place.

To be clear, I'm not saying that any of these things are configured
incorrectly; only that they may be interacting with Subversion in a way
that we don't handle well. One of the major differences between 1.7
(which works) and 1.8 (which fails) is that we try to work around issues
with non-standard behaviour of certain "transparent" (sic) proxies; and
we can't claim to have covered all the possibilities.

I can't see a way to figure out what's going on without help from your
network admins; we need some insight into why the connection is being
reset on the server side, and analyzing the TCP stream on the client
can't tell us that.


BTW, if you think it'd help to try a live debugging session, I'm only
about an hour's drive away from IJS.


-- Brane




Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-09 Thread Mojca Miklavec
On Wed, Jan 8, 2014 at 4:53 PM, Philip Martin wrote:
>
> I get a problem with the checkout from your server using a trunk client.
> Very occasionally the checkout works but most of the time the client
> simply hangs while receiving the first file.
>
> It appears that the client is sending the REPORT request and receiving
> the response from the server.  The client then pipelines 13 GET requests
> corresponding to the 13 files in the working copy.  The server starts
> sending the response to the first GET and the client starts receiving it
> but the server never completes the response.  The client hangs waiting
> for the server and eventually times out.
>
> If I use wireshark it shows the server sending an RST packet just before
> the client hangs.  According to wireshark this is a "Bad checksum"
> packet.  Wireshark shows the client retransmitting the GETs but there is
> no further server response.
>
> I don't know enough to debug the problem further.

I have now upgraded the SVN client to 1.9.0-dev from trunk. With the trunk
version the behaviour is still inconsistent, but at least reproducible
to a certain extent.

I tried to checkout file a couple of times. Almost every time I get
the following lines in error.log on the server:

Unable to deliver content.  [500, #0]
Could not write data to filter.  [500, #175002]

but the first time the whole checkout finished successfully, even
though the server first recorded "500" and apparently another "200"
(success) on the second attempt for the same file. The client ended
with success.

The second time the client reported
svn: E000054: Error running context: Connection reset by peer
(and the same happened when I ran it for the third/fourth/fifth/...
time) Sometimes it works though. And it usually hangs on different
files.

I'm unable to reproduce the faulty behaviour if I do a checkout from
the same network where the server is located, no matter what I try
(upgrading the SVN client doesn't "help" trigger the error). Philip
also said that he had no problem doing a checkout with client version
1.8.5 or 1.7.

With subversion client 1.8.5 I'm sometimes able to reproduce the
problem from a different network, but it usually works. I tried
wireshark, but I don't know what to do with the zillions of packets it
shows me.

I'll first try to copy the repository to another server to see if I
could reproduce the problem from there. Other than that I would be
grateful for any hints if there exists some painless way to debug the
server.

Mojca


Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-08 Thread Philip Martin
Mojca Miklavec  writes:

> On Wed, Jan 8, 2014 at 1:11 PM, Branko Čibej wrote:
>> Hi Mojca,
>>
>> On 07.01.2014 20:58, Mojca Miklavec wrote:
>>> (The other problem with "Error retrieving REPORT" is still a mystery
>>> though.) Mojca
>>
>> I'm assuming your server is somewhere on the IJS network. Can you please
>> ask the admins there if your server is behind a load balancer?
>
> I asked and it is not.

I get a problem with the checkout from your server using a trunk client.
Very occasionally the checkout works but most of the time the client
simply hangs while receiving the first file.

It appears that the client is sending the REPORT request and receiving
the response from the server.  The client then pipelines 13 GET requests
corresponding to the 13 files in the working copy.  The server starts
sending the response to the first GET and the client starts receiving it
but the server never completes the response.  The client hangs waiting
for the server and eventually times out.
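
For readers unfamiliar with the term, "pipelining" here means writing several requests to one connection without waiting for each response. A minimal sketch of what 13 pipelined GETs look like on the wire follows; the host and paths are placeholders, and a real Subversion client sends additional headers that are omitted here.

```python
def build_pipelined_gets(host, paths):
    """Return the bytes for one GET per path, concatenated for a single write."""
    requests = []
    for path in paths:
        request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"
        requests.append(request.encode("ascii"))
    # All requests go out back to back; responses must come back in order.
    return b"".join(requests)
```

Anything on the path that inspects HTTP traffic sees all of these requests before the first response has finished, which matters when that first response is a large file.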

If I use wireshark it shows the server sending an RST packet just before
the client hangs.  According to wireshark this is a "Bad checksum"
packet.  Wireshark shows the client retransmitting the GETs but there is
no further server response.

I don't know enough to debug the problem further.



Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-08 Thread Mojca Miklavec
On Wed, Jan 8, 2014 at 1:11 PM, Branko Čibej wrote:
> Hi Mojca,
>
> On 07.01.2014 20:58, Mojca Miklavec wrote:
>> (The other problem with "Error retrieving REPORT" is still a mystery
>> though.) Mojca
>
> I'm assuming your server is somewhere on the IJS network. Can you please
> ask the admins there if your server is behind a load balancer?

I asked and it is not.

Mojca


Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-08 Thread Branko Čibej
Hi Mojca,

On 07.01.2014 20:58, Mojca Miklavec wrote:
> (The other problem with "Error retrieving REPORT" is still a mystery
> though.) Mojca 

I'm assuming your server is somewhere on the IJS network. Can you please
ask the admins there if your server is behind a load balancer?

-- Brane



Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-07 Thread Mojca Miklavec
On Tue, Jan 7, 2014 at 8:44 PM, Ryan Schmidt wrote:
> On Jan 7, 2014, at 12:01, Mojca Miklavec wrote:
>> On Tue, Jan 7, 2014 at 5:47 PM, Philip Martin wrote:
>>
>>> Which version of Apache are you using?  Which Apache MPM are you using?
>>
>> Server version: Apache/2.4.7 (Unix)
>>
>> I'm not sure how to check MPM. I get
>>
>>> httpd -l
>> Compiled in modules:
>>  core.c
>>  mod_so.c
>>  http_core.c
>>
>> but "httpd -V" as suggested on some websites doesn't work. How should
>> I check which MPM is being used?
>
> In what way does “httpd -V” not work?

In this way:

> httpd -V
[] [so:warn] [pid 63924] AH01574: module dav_svn_module is
already loaded, skipping
[] [so:warn] [pid 63924] AH01574: module authz_svn_module is
already loaded, skipping
AH00548: NameVirtualHost has no effect and will be removed in the next
release /path/to/00-vhosts.conf:1
(13)Permission denied: AH02291: Cannot access directory
'/path/to/logs/1/' for error log of vhost defined at
/path/to/20-another.conf:4
...
... (repeats a bunch of times)
...
AH00014: Configuration check failed


But I saw the trick now. It wants me to use "sudo httpd -V" for some
reason, then it works. And yes, it's "prefork" in my case as well, but
that's probably no longer relevant now that one mystery with forgotten
Apache restart was solved.

(The other problem with "Error retrieving REPORT" is still a mystery though.)

Mojca


Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-07 Thread Ryan Schmidt

On Jan 7, 2014, at 12:01, Mojca Miklavec  wrote:

> On Tue, Jan 7, 2014 at 5:47 PM, Philip Martin wrote:
> 
>> Which version of Apache are you using?  Which Apache MPM are you using?
> 
> Server version: Apache/2.4.7 (Unix)
> 
> I'm not sure how to check MPM. I get
> 
>> httpd -l
> Compiled in modules:
>  core.c
>  mod_so.c
>  http_core.c
> 
> but "httpd -V" as suggested on some websites doesn't work. How should
> I check which MPM is being used?

In what way does “httpd -V” not work? On my Mac it gives me the answer
(“Server MPM: prefork”):

$ httpd -V
Server version: Apache/2.4.7 (Unix)
Server built:   Nov 26 2013 23:32:37
Server's Module Magic Number: 20120211:27
Server loaded:  APR 1.4.8, APR-UTIL 1.5.2
Compiled using: APR 1.4.8, APR-UTIL 1.5.2
Architecture:   64-bit
Server MPM: prefork
  threaded: no
forked: yes (variable process count)
Server compiled with
 -D APR_HAS_SENDFILE
 -D APR_HAS_MMAP
 -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
 -D APR_USE_SYSVSEM_SERIALIZE
 -D APR_USE_PTHREAD_SERIALIZE
 -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
 -D APR_HAS_OTHER_CHILD
 -D AP_HAVE_RELIABLE_PIPED_LOGS
 -D DYNAMIC_MODULE_LIMIT=256
 -D HTTPD_ROOT="/opt/local"
 -D SUEXEC_BIN="/opt/local/bin/suexec"
 -D DEFAULT_PIDLOG="var/run/apache2/httpd.pid"
 -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
 -D DEFAULT_ERRORLOG="logs/error_log"
 -D AP_TYPES_CONFIG_FILE="etc/apache2/mime.types"
 -D SERVER_CONFIG_FILE="etc/apache2/httpd.conf"
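
If the MPM has to be checked on several machines, the relevant line can be picked out of the "httpd -V" output with a short script. This is a sketch that assumes the "Server MPM:" line looks like the one in the output above; the function name is made up.

```python
import re


def server_mpm(httpd_v_output):
    """Return the MPM name from 'httpd -V' output, or None if it is absent."""
    match = re.search(r"^Server MPM:\s*(\S+)", httpd_v_output, re.MULTILINE)
    return match.group(1) if match else None
```

It returns None when the output contains only warnings or a failed configuration check, so that case can be told apart from a real answer.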



Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-07 Thread Mojca Miklavec
On Tue, Jan 7, 2014 at 7:34 PM, Philip Martin wrote:
>
> So you used dump/load to create a new repository and then replaced the
> old repository with the new repository?  If you did that while Apache
> was running, without restarting Apache, then that explains the 'Corrupt
> node-revision' error as you changed the data on disk.

Ah, thanks a lot for explaining that. Yes, I did dump/load the old
repository into a new one because I wanted to test if it would solve
the problem

(on client)
svn: E000054: Error retrieving REPORT: Connection reset by peer
(on server)
[dav:error] [pid 3613] [client ] Unable to deliver content.  [500, #0]
[dav:error] [pid 3613] [client ] Could not write data to
filter.  [500, #175002]

which it didn't. It only added a few additional problems until I
restarted Apache (I'm sorry for confusing you with those), but the
initial error E000054/175002 is still causing problems.

> What you are left with is some sort of intermittent network problem.  I
> don't know what is causing that.

Is there any way to debug that?

Thank you very much,
Mojca


Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-07 Thread Philip Martin
Mojca Miklavec  writes:

> Ah, OK, I see it now in the old logs. There are no such lines in the
> latest logs.

> The repository is stored on a local disk. I'm not sure what exactly you
> are asking about the filesystem, but here are some possibly
> relevant data:
>
>> cat format
> 5
>> cat db/fs-type
> fsfs
>> cat db/format
> 6
> layout sharded 1000
>
> (and before I upgraded the repository, db/format was 4). Is that what
> you were asking or did you want to know something else?

So you used dump/load to create a new repository and then replaced the
old repository with the new repository?  If you did that while Apache
was running, without restarting Apache, then that explains the 'Corrupt
node-revision' error as you changed the data on disk.

What you are left with is some sort of intermittent network problem.  I
don't know what is causing that.



Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-07 Thread Mojca Miklavec
On Tue, Jan 7, 2014 at 5:47 PM, Philip Martin wrote:
> Mojca Miklavec writes:
>> On Tue, Jan 7, 2014 at 5:18 PM, Philip Martin wrote:
>>> Mojca Miklavec writes:
>>>
>>>> Yes, there is still a problem after restarting Apache. Even though it
>>>> works for me at the moment and I tried fetching from multiple
>>>> locations and servers, other users are still experiencing the same
>>>> problem. Logs on the server confirm that. (Unable to deliver content.
>>>> [500, #0] + Could not write data to filter.  [500, #175002])
>>>
>>> Does the server log always contain the error:
>>>
>>>    svn: E160004: Corrupt node-revision '2-1.0.r137/330061'
>>
>> I don't see that in the server log, but I was only checking error.log
>> written by Apache server, I don't know where else to look, but I can
>> check if you point me in the right direction. This error is sometimes
>> displayed by the "client" (either in XML in the browser or as an error
>> in the command line during "svn up"), but it's not consistent and it
>> often works properly.
>
> It would be in the Apache error log.

Ah, OK, I see it now in the old logs. There are no such lines in the
latest logs.

> Are you saying that sometimes the client gets the E175002 error without
> the 'Corrupt node-revision' part?

Yes. I'm attaching full log (with timestamps and IPs removed) for a
certain period of time around 4th January. There are plenty of E175002
errors without any subsequent 'Corrupt node-revision' part, including
all the latest entries (not part of the attachment).

> Are you saying that the client gets the 'Corrupt node-revision' error
> but it is not recorded in the error log?

I was wrong about that. I was only checking the latest error log where
all I get is

[dav:error] [pid 42289] [:29011] Unable to deliver content.  [500, #0]
[dav:error] [pid 42289] [:29011] Could not write data to filter.
[500, #175002]

But I've found those additional errors in an old (archived) log. At the
moment I'm unable to reproduce the error 'Corrupt node-revision' both
on the client and in server logs, but the repository is still
misbehaving.
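
To see how often each error appears in a long error log, the "[status, #code]" pairs at the end of the dav:error lines can be tallied. The sketch below assumes only that bracketed pair format, as seen in the excerpts above, and copes with entries that were wrapped onto a second line; the names are made up for illustration.

```python
import re
from collections import Counter

# Matches the trailing "[500, #175002]" style pair of a dav:error entry.
STATUS_PAIR = re.compile(r"\[(\d{3}),\s+#(\d+)\]")


def count_dav_errors(log_text):
    """Count (HTTP status, error code) pairs found anywhere in the log text."""
    return Counter(
        (int(status), int(code)) for status, code in STATUS_PAIR.findall(log_text)
    )
```

Comparing the counts for (500, 175002) between the archived and the current log would show whether the write failures changed after Apache was restarted.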

>> It sometimes works in the first attempt, fails in the second one, and
>> succeeds in the third attempt again. Only seconds or minutes apart.
>>
>>> Is it always '2-1.0.r137/330061'?
>>
>> The exact revision reported as corrupt depends on which subfolder I'm
>> checking out. I believe it reports the last commit when files in that
>> particular subfolder were modified. (I've seen this error when
>> checking out two different subfolders. The number was always the same
>> for the same subfolder, but different for different subfolders.)
>>
>> (It is a bit difficult to test because the behaviour is not consistent.)
>
> Which version of Apache are you using?  Which Apache MPM are you using?

Server version: Apache/2.4.7 (Unix)

I'm not sure how to check MPM. I get

> httpd -l
Compiled in modules:
  core.c
  mod_so.c
  http_core.c

but "httpd -V" as suggested on some websites doesn't work. How should
I check which MPM is being used?

> What sort of filesystem is used for the repository?  Is it a local disk
> or a network disk?

The repository is stored on a local disk. I'm not sure what exactly you
are asking about the filesystem, but here are some possibly
relevant data:

> cat format
5
> cat db/fs-type
fsfs
> cat db/format
6
layout sharded 1000

(and before I upgraded the repository, db/format was 4). Is that what
you were asking or did you want to know something else?
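
The same information can be collected by script when several repositories need checking. This sketch assumes the db/format layout shown above: a format number on the first line, followed by optional "name value" lines such as "layout sharded 1000". The function name is hypothetical.

```python
def parse_fsfs_format(text):
    """Return (format_number, options) parsed from the content of db/format."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty format file")
    number = int(lines[0])
    options = {}
    for line in lines[1:]:
        # Each option line is a keyword followed by its value(s).
        key, _, value = line.partition(" ")
        options[key] = value
    return number, options
```

For the repository above this yields format 6 with the option layout set to "sharded 1000".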

Mojca


error.log
Description: Binary data


Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-07 Thread Philip Martin
Mojca Miklavec  writes:

> On Tue, Jan 7, 2014 at 5:18 PM, Philip Martin wrote:
>> Mojca Miklavec writes:
>>
>>> Yes, there is still a problem after restarting Apache. Even though it
>>> works for me at the moment and I tried fetching from multiple
>>> locations and servers, other users are still experiencing the same
>>> problem. Logs on the server confirm that. (Unable to deliver content.
>>> [500, #0] + Could not write data to filter.  [500, #175002])
>>
>> Does the server log always contain the error:
>>
>>    svn: E160004: Corrupt node-revision '2-1.0.r137/330061'
>
> I don't see that in the server log, but I was only checking error.log
> written by Apache server, I don't know where else to look, but I can
> check if you point me in the right direction. This error is sometimes
> displayed by the "client" (either in XML in the browser or as an error
> in the command line during "svn up"), but it's not consistent and it
> often works properly.

It would be in the Apache error log.

Are you saying that sometimes the client gets the E175002 error without
the 'Corrupt node-revision' part?

Are you saying that the client gets the 'Corrupt node-revision' error
but it is not recorded in the error log?

> It sometimes works in the first attempt, fails in the second one, and
> succeeds in the third attempt again. Only seconds or minutes apart.
>
>> Is it always '2-1.0.r137/330061'?
>
> The exact revision reported as corrupt depends on which subfolder I'm
> checking out. I believe it reports the last commit when files in that
> particular subfolder were modified. (I've seen this error when
> checking out two different subfolders. The number was always the same
> for the same subfolder, but different for different subfolders.)
>
> (It is a bit difficult to test because the behaviour is not consistent.)

Which version of Apache are you using?  Which Apache MPM are you using?

What sort of filesystem is used for the repository?  Is it a local disk
or a network disk?



Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-07 Thread Mojca Miklavec
On Tue, Jan 7, 2014 at 5:18 PM, Philip Martin wrote:
> Mojca Miklavec writes:
>
>> Yes, there is still a problem after restarting Apache. Even though it
>> works for me at the moment and I tried fetching from multiple
>> locations and servers, other users are still experiencing the same
>> problem. Logs on the server confirm that. (Unable to deliver content.
>> [500, #0] + Could not write data to filter.  [500, #175002])
>
> Does the server log always contain the error:
>
>    svn: E160004: Corrupt node-revision '2-1.0.r137/330061'

I don't see that in the server log, but I was only checking error.log
written by Apache server, I don't know where else to look, but I can
check if you point me in the right direction. This error is sometimes
displayed by the "client" (either in XML in the browser or as an error
in the command line during "svn up"), but it's not consistent and it
often works properly.

It sometimes works in the first attempt, fails in the second one, and
succeeds in the third attempt again. Only seconds or minutes apart.

> Is it always '2-1.0.r137/330061'?

The exact revision reported as corrupt depends on which subfolder I'm
checking out. I believe it reports the last commit when files in that
particular subfolder were modified. (I've seen this error when
checking out two different subfolders. The number was always the same
for the same subfolder, but different for different subfolders.)

(It is a bit difficult to test because the behaviour is not consistent.)

Mojca


Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-07 Thread Philip Martin
Mojca Miklavec  writes:

> Yes, there is still a problem after restarting Apache. Even though it
> works for me at the moment and I tried fetching from multiple
> locations and servers, other users are still experiencing the same
> problem. Logs on the server confirm that. (Unable to deliver content.
> [500, #0] + Could not write data to filter.  [500, #175002])

Does the server log always contain the error:

   svn: E160004: Corrupt node-revision '2-1.0.r137/330061'

Is it always '2-1.0.r137/330061'?



Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-07 Thread Mojca Miklavec
On Tue, Jan 7, 2014 at 12:41 PM, Philip Martin wrote:
> Mojca Miklavec writes:
>
>> We have a server running Fedora which has recently been upgraded to
>> version 20 and it's now running
>> svn, version 1.8.5 (r1542147)
>>
>> I have a bunch of repositories served over http protocol with public
>> read access and limited commit access.
>>
>> Shortly after the upgrade a weird behaviour has been noticed. Running
>> "svn up" on the top level dir worked ok for me, but running
>> svn co http://svn.myserver.net/myrepo/dirA
>> fails with
>>
>> A    dirA/subdir1
>> A    dirA/subdir2
>> A    dirA/subdir3
>> A    dirA/subdir4
>> svn: E000054: Error retrieving REPORT: Connection reset by peer
>>
>> The directory "dirA" contains one more file FILE.txt. Checking out any
>> individual "subdirN" works and the browser is able to display the
>> contents of dirA.
>>
>> Trying to click on FILE.txt in the browser sometimes works (it
>> currently does) and sometimes shows an XML (like a few minutes ago,
>> but I'm unable to get it now), saying something similar to the error I
>> get in console***:
>>
>> svn: E175002: Unable to connect to a repository at URL
>> 'svn.myserver.net/myrepo/dirA'
>> svn: E175002: Unexpected HTTP status 500 'Internal Server Error' on
>> '/myrepo/dirA'
>>
>> svn: E160004: Additional errors:
>> svn: E160004: Corrupt node-revision '2-1.0.r137/330061'
>>
>> (*** To be precise: this is the error I get after upgrading the
>> repository to the latest version of SVN, I didn't try to get to this
>> error before upgrading.)
>>
>> The error.log in apache says just:
>>
>> [] [dav:error] [pid 3613] [client ] Unable to deliver
>> content.  [500, #0]
>> [] [dav:error] [pid 3613] [client ] Could not write
>> data to filter.  [500, #175002]
>>
>> I first tried if upgrading the repository would help in any way, so I did
>> svnadmin dump oldrepo | svnadmin load newrepo
>> and checking the relevant revision r137 cited in the error all I see
>> is the following (nothing unusual):
>>
>> --- Committed revision 136 >>>
>>
>> <<< Started new transaction, based on original revision 137
>>  * editing path : dirA/FILE.txt ... done.
>> * Dumped revision 137.
>>  * editing path : dirA/subdir1/somefile ... done.
>>
>> --- Committed revision 137 >>>
>>
>> Checking out the same repository via http on the machine where the
>> repository itself is located works fine.
>>
>> I'm using the same version of SVN (1.8.5) on Mac, but other svn
>> clients on other OSes have problems as well.
>>
>> I tried checking the repository health with
>> svnadmin verify /path/to/myrepo
>> and all revisions passed except for some weird error in between (the
>> file rev-prop-atomics.mutex is actually missing, but it isn't present
>> in any other repository either):
>>
>> * Verifying repository metadata ...
>> * Verifying metadata at revision 1 ...
>> ...
>> * Verifying metadata at revision 155 ...
>> svnadmin: E160052: Revprop caching for '/path/to/myrepo/db' disabled
>> because SHM infrastructure for revprop caching failed to initialize.
>> svnadmin: E000013: Can't open file
>> '/path/to/myrepo/db/rev-prop-atomics.mutex': Permission denied
>> * Verified revision 0.
>> ...
>> * Verified revision 160.
>>
>>
>> I would appreciate any help or debugging hints. If necessary I can
>> share the repository URL (but I would prefer to share it off-list to
>> anyone interested in debugging). I can also try to debug myself, but I
>> need some instructions telling me what to check. I didn't manage to
>> find anything useful by googling the errors other than figuring out
>> that the error was part of the code to fix a memory leak
>> (http://svn.haxx.se/dev/archive-2009-08/0274.shtml).
>
> I've not seen E000054 before but it is EXFULL which is some sort of
> network error.  I suppose the corruption causes some sort of output
> problem.
>
> E000013 is EACCES so you are running verify without write access to the
> repository.  That seems like a perfectly reasonable thing to do so we
> should probably make the warning less intimidating.
>
> It's very odd that Apache is reporting corruption but both the dump/load
> and verify work without problem.  Is the problem reproducible if you
> restart Apache?

Yes, there is still a problem after restarting Apache. Even though it
works for me at the moment and I tried fetching from multiple
locations and servers, other users are still experiencing the same
problem. Logs on the server confirm that. (Unable to deliver content.
[500, #0] + Could not write data to filter.  [500, #175002])

Mojca


Re: Seeking help for "E000054: Error retrieving REPORT: Connection reset by peer"

2014-01-07 Thread Philip Martin
Mojca Miklavec writes:

> We have a server running Fedora which has recently been upgraded to
> version 20 and it's now running
> svn, version 1.8.5 (r1542147)
>
> I have a bunch of repositories served over http protocol with public
> read access and limited commit access.
>
> Shortly after the upgrade a weird behaviour has been noticed. Running
> "svn up" on the top level dir worked ok for me, but running
> svn co http://svn.myserver.net/myrepo/dirA
> fails with
>
> A    dirA/subdir1
> A    dirA/subdir2
> A    dirA/subdir3
> A    dirA/subdir4
> svn: E000054: Error retrieving REPORT: Connection reset by peer
>
> The directory "dirA" contains one more file FILE.txt. Checking out any
> individual "subdirN" works and the browser is able to display the
> contents of dirA.
>
> Trying to click on FILE.txt in the browser sometimes works (it
> currently does) and sometimes shows an XML (like a few minutes ago,
> but I'm unable to get it now), saying something similar to the error I
> get in console***:
>
> svn: E175002: Unable to connect to a repository at URL
> 'svn.myserver.net/myrepo/dirA'
> svn: E175002: Unexpected HTTP status 500 'Internal Server Error' on
> '/myrepo/dirA'
>
> svn: E160004: Additional errors:
> svn: E160004: Corrupt node-revision '2-1.0.r137/330061'
>
> (*** To be precise: this is the error I get after upgrading the
> repository to the latest version of SVN, I didn't try to get to this
> error before upgrading.)
>
> The error.log in apache says just:
>
> [] [dav:error] [pid 3613] [client ] Unable to deliver
> content.  [500, #0]
> [] [dav:error] [pid 3613] [client ] Could not write
> data to filter.  [500, #175002]
>
> I first checked whether upgrading the repository would help, so I ran
> svnadmin dump oldrepo | svnadmin load newrepo
> and checking the relevant revision r137 cited in the error all I see
> is the following (nothing unusual):
>
> --- Committed revision 136 >>>
>
> <<< Started new transaction, based on original revision 137
>  * editing path : dirA/FILE.txt ... done.
> * Dumped revision 137.
>  * editing path : dirA/subdir1/somefile ... done.
>
> --- Committed revision 137 >>>
>
> Checking out the same repository via http on the machine where the
> repository itself is located works fine.
>
> I'm using the same version of SVN (1.8.5) on Mac, but other svn
> clients on other OSes have problems as well.
>
> I tried checking the repository health with
> svnadmin verify /path/to/myrepo
> and all revisions passed except for some weird error in between (the
> file rev-prop-atomics.mutex is actually missing, but it isn't present
> in any other repository either):
>
> * Verifying repository metadata ...
> * Verifying metadata at revision 1 ...
> ...
> * Verifying metadata at revision 155 ...
> svnadmin: E160052: Revprop caching for '/path/to/myrepo/db' disabled
> because SHM infrastructure for revprop caching failed to initialize.
> svnadmin: E000013: Can't open file
> '/path/to/myrepo/db/rev-prop-atomics.mutex': Permission denied
> * Verified revision 0.
> ...
> * Verified revision 160.
>
>
> I would appreciate any help or debugging hints. If necessary I can
> share the repository URL (but I would prefer to share it off-list to
> anyone interested in debugging). I can also try to debug myself, but I
> need some instructions telling me what to check. I didn't manage to
> find anything useful by googling the errors other than figuring out
> that the error was part of the code to fix a memory leak
> (http://svn.haxx.se/dev/archive-2009-08/0274.shtml).

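I've not seen E54 before. On Linux errno 54 is EXFULL, but on Mac OS X,
where your client runs, it is ECONNRESET, a network error that matches
the "Connection reset by peer" text.  I suppose the corruption causes
some sort of output problem.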

E13 is EACCES so you are running verify without write access to the
repository.  That seems like a perfectly reasonable thing to do so we
should probably make the warning less intimidating.
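
Low-numbered Subversion error codes such as E54 and E13 are operating
system errno values, and errno numbering differs between platforms, so
it is worth looking a code up on the machine that printed it. A minimal
sketch using Python's standard errno module; the helper name
describe_svn_code is hypothetical, and the cut-off of 20000 is assumed
to be APR's boundary between OS errors and its own codes:

```python
import errno
import os
import re

# Assumption: codes below this value are plain OS errno values passed
# through by APR; larger values are APR- or Subversion-specific.
APR_OS_START_ERROR = 20000


def describe_svn_code(code):
    """Hypothetical helper: describe an svn 'E' code using the local errno table."""
    match = re.fullmatch(r"E(\d+)", code.strip())
    if match is None:
        raise ValueError(f"not an svn error code: {code!r}")
    number = int(match.group(1))
    if number >= APR_OS_START_ERROR:
        return f"{number}: not an OS errno (APR or Subversion specific)"
    # errno.errorcode maps numbers to symbolic names for *this* platform only.
    name = errno.errorcode.get(number, "unknown")
    return f"{number}: {name} ({os.strerror(number)})"


if __name__ == "__main__":
    for code in ("E000054", "E13", "E160004"):
        print(code, "->", describe_svn_code(code))
```

Running this on both the server and the client shows whether the two
machines agree on what a given number means.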

It's very odd that Apache is reporting corruption but both the dump/load
and verify work without problem.  Is the problem reproducible if you
restart Apache?

-- 
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*