Re: mod_dav_svn changing the request filename - interaction with mod_rewrite

2013-12-12 Thread Thomas Åkesson

On 11 dec 2013, at 22:22, Ben Reser b...@reser.org wrote:

 On 12/11/13 10:45 AM, Ben Reser wrote:
 Hmm this is going to be a pain to fix (possibly impossible).  Because what
 mod_rewrite is doing is really hackish.  When you use the PT (PassThrough) 
 flag
 mod_rewrite puts passthrough:/my/new/URL into r-filename.  Then eventually 
 it
 moves it to r-uri, but not until after it goes through our translate_name
 hook.  So we can't reliably know if the request is going to be served by us 
 or
 not by asking for the per_dir config.
 
 Okay this isn't strictly true.  r-uri is set before it gets to us.  It's just
 that mod_rewrite doesn't trigger a new location walk (that happens after the
 map_to_storage hooks run which is the next step after the translate_name 
 hook).
 In order for mod_rewrite to work transparently the location_walk would need to
 be triggered again.

I am not very familiar with httpd internals, but I would have expected that PT 
flag would cause a completely new round of translate_name and map_to_storage 
(i.e. one round with original url and one round with rewritten url). Have you 
confirmed that is not the case? 

If it does trigger a new round, wouldn't it be fine to just do nothing if a 
PT-rewrite is detected? 

 
 We could do that on our side with something like this:
...
 
 But I really don't think we should for several reasons.
 
 First of all it starts messing with httpd internals that I don't think we
 should really need to be doing.  Note that I had to include the ap_if_walk()
 for 2.3.x+.  This sort of fix would have to forever be kept in sync with the
 code in ap_process_request_internal().  Making it rather brittle.

Yes, I can see your concern here.

 
 So at this point I think this needs to be taken to the d...@httpd.apache.org
 list.  I'll start a new thread over there.

Thanks for looking into the matter.

 None of these use cases are strictly legal.  The namespace both path wise and
 query argument wise belong to SVN.  Overlaying your own paths or query
 arguments is always going to be prone to future incompatibilities.  You've 
 just
 finally been burned.
 
 The right way to do what you've been doing is some other path like (where 
 /view
 is not configured to be handled by Subversion):
 http://svn.example.com/view/details/repo/folder
 or
 http://svn.example.com/view/repo/folder?details

We see SVN as a component in a larger context. Svn is the core versioned 
storage of our CMS-system, where we provide a number of additional services. 
Keeping consistent paths significantly reduces end-user confusion and  what we 
consider a more RESTful set of URLs. 

I do agree that the namespaces belong to SVN. We use a single parameter to 
indicate the operation/view and definitely not a single-letter one (we never 
touch the paths). A concept similar to property namespaces would be ideal, e.g. 
cms-view=details (or x-view=details similar to non-registered URL prefix).

Thanks,
Thomas Å.

Re: mod_dav_svn changing the request filename - interaction with mod_rewrite

2013-12-12 Thread Ben Reser
On 12/12/13 6:39 AM, Thomas Åkesson wrote:
 I am not very familiar with httpd internals, but I would have expected that 
 PT flag would cause a completely new round of translate_name and 
 map_to_storage (i.e. one round with original url and one round with rewritten 
 url). Have you confirmed that is not the case? 

That is confirmed.  All PT does is rewrite the URI and then return DECLINED in
mod_rewrite's translate_name hook which allows the other translate_name hooks
to run (translate_name runs until one hook returns OK, which the core hook
usually does), intending to ultimately get to the core one which is the final
hook.  Subversion is preventing that final hook from running (by returning OK)
in order to set our own r-filename.  We also set a flag on the request to
specify that we should prevent the core map_to_storage() hook from running,
which we do by also returning OK.  The core map_to_storage() hook has a
directory and file walk in it that matches the Directory blocks which we want
to prevent.

The problem here is that I have to do the processing in translate_name and
map_to_storage hooks.  But I can't be assured that when I ask for the per_dir
configuration from httpd for my module for the request that it will be accurate
because mod_rewrite has changed r-uri without running a new location walk.
The 2nd location walk will ultimately happen but not until after translate_name
and map_to_storage are done, which is too late for me to solve the problem we
made this change for.

Anyone reading this that isn't super familiar with request processing in httpd
may want to read this:
https://httpd.apache.org/docs/2.4/developer/request.html

 We see SVN as a component in a larger context. Svn is the core versioned 
 storage of our CMS-system, where we provide a number of additional services. 
 Keeping consistent paths significantly reduces end-user confusion and  what 
 we consider a more RESTful set of URLs. 
 
 I do agree that the namespaces belong to SVN. We use a single parameter to 
 indicate the operation/view and definitely not a single-letter one (we 
 never touch the paths). A concept similar to property namespaces would be 
 ideal, e.g. cms-view=details (or x-view=details similar to non-registered URL 
 prefix).

Nod.  I understand the motivations.  It doesn't seem like an unreasonable thing
to allow provided it doesn't create problems for us.

The only problem here is with PT.  You absolutely can use external redirects.
You can also use P (PROXY) to get around this as well, though it'll generate a
new request at the new URI (and you'll need to have mod_proxy and
mod_proxy_http loaded).  E.G.

RewriteEngine On
RewriteCond %{REQUEST_URI} ^/svn/
RewriteCond %{QUERY_STRING} view
RewriteRule (.*) http://127.0.0.1:8081/view.php?path=$1 [P,QSA]

The patch I posted in my last email is reasonably safe for all of the included
modules with httpd.  There are no translate_name hooks registered to the
APR_HOOK_LAST bucket (like we are) leaving the only hook that follows us as the
core translate_name hook, which doesn't use per_dir configuration.  The per_dir
config is wiped out before the map_to_storage hook is run, so the end of the
translate_name hooks is the end of that patch potentially causing problems.
The only place where you may run into problems is other 3rd party modules
(including modules written using mod_lua that is included with httpd 2.3.x+).
But not very many modules hook translate_name.  Of course it would have to be
maintained and I'm not convinced we should apply it to our own code.

That's the best advice I can give for now.


Re: mod_dav_svn changing the request filename - interaction with mod_rewrite

2013-12-11 Thread Ben Reser
On 12/11/13 5:18 AM, Thomas Åkesson wrote:
 Hi,
 Revision 1512432 causes a regression when mod_dav_svn is used together with 
 mod_rewrite, which we have done successfully since Subversion 1.5. I have 
 also studied the follow up commits which change the approach somewhat from 
 setting filename to null into a bogus file.
 
 Use case:
 Using mod_rewrite we can serve information pages using the correct 
 Subversion URL.
 
 Example:
 http://svn.example.com/svn/repo/folder/?view=details
 
 This page might be served by PHP or other script which is located elsewhere 
 on the server (say /details/index.php). By using a rewrite with PT flag this 
 page can be displayed without redirecting to a separate URL. It was possible 
 before 1.8.3.
 
 - In 1.8.1 works.
 - In 1.8.3 the filename is set to null and we get a 404.
 - In 1.8.5 the filename is set to something like 
 'svn:/pathtorepositories/repo/details/'. Also 404.
 
 The PT flag sends the rewritten URI  back through URL mapping. For some 
 reason, the recently introduced hook dav_svn__translate_name is executed 
 despite the rewritten URL being outside of the /svn Location. Consequently 
 the request filename field is modified by mod_dav_svn.
 
 My best guess is that the sequence becomes:
 - mod_rewrite gets translate_name hook, changes the URI.
 - mod_dav_svn gets translate_name hook, because it was initially expected to 
 handle the request.
 - renewed URL mapping finds that mod_dav_svn should not handle, selects 
 another one, e.g. mod_php.
 - ...
 
 I did some searching in dev-list for discussions on these changes, but I 
 didn't find much. Any recommended reading?
 
 Perhaps the translate_name hook needs to decline in some additional 
 situations, e.g. when the requested path is actually no longer going to be 
 served by mod_dav_svn.
 
 We have failed to find a workaround for this issue so it is a blocker for us 
 to upgrade our production servers beyond 1.8.1.

Hmm this is going to be a pain to fix (possibly impossible).  Because what
mod_rewrite is doing is really hackish.  When you use the PT (PassThrough) flag
mod_rewrite puts passthrough:/my/new/URL into r-filename.  Then eventually it
moves it to r-uri, but not until after it goes through our translate_name
hook.  So we can't reliably know if the request is going to be served by us or
not by asking for the per_dir config.

If we disable handling requests that have a passthrough filename then if the
passthrough ends up coming back to us (e.g. /svn/repo/b is mapped to
/svn/repo/a) then the request ends up being mapped to a bogus filesystem path.
 If we enable it then requests (like in your configuration) that aren't going
to be handled by us end up not getting mapped properly.

The following patch (code stolen from mod_charset_lite) avoids the problem
you're having but due to the above isn't a good solution.

[[[
Index: subversion/mod_dav_svn/mod_dav_svn.c
===
--- subversion/mod_dav_svn/mod_dav_svn.c(revision 1549924)
+++ subversion/mod_dav_svn/mod_dav_svn.c(working copy)
@@ -1127,6 +1127,15 @@
   if (!conf-fs_path  !conf-fs_parent_path)
 return DECLINED;

+  /* mod_rewrite indicators */
+  if (r-filename
+   (!strncmp(r-filename, redirect:, 9)
+  || !strncmp(r-filename, gone:, 5)
+  || !strncmp(r-filename, passthrough:, 12)
+  || !strncmp(r-filename, forbidden:, 10))) {
+  return DECLINED;
+  }
+
   if (dav_svn__is_parentpath_list(r))
 {
   /* SVNListParentPath is on and the request is for the conf-root_dir,
]]]

None of these use cases are strictly legal.  The namespace both path wise and
query argument wise belong to SVN.  Overlaying your own paths or query
arguments is always going to be prone to future incompatibilities.  You've just
finally been burned.

The right way to do what you've been doing is some other path like (where /view
is not configured to be handled by Subversion):
http://svn.example.com/view/details/repo/folder
or
http://svn.example.com/view/repo/folder?details

I'll see what I can do but I think you may just have to bite the bullet and
change your configuration.


Re: mod_dav_svn changing the request filename - interaction with mod_rewrite

2013-12-11 Thread Ben Reser
On 12/11/13 10:45 AM, Ben Reser wrote:
 Hmm this is going to be a pain to fix (possibly impossible).  Because what
 mod_rewrite is doing is really hackish.  When you use the PT (PassThrough) 
 flag
 mod_rewrite puts passthrough:/my/new/URL into r-filename.  Then eventually it
 moves it to r-uri, but not until after it goes through our translate_name
 hook.  So we can't reliably know if the request is going to be served by us or
 not by asking for the per_dir config.

Okay this isn't strictly true.  r-uri is set before it gets to us.  It's just
that mod_rewrite doesn't trigger a new location walk (that happens after the
map_to_storage hooks run which is the next step after the translate_name hook).
 In order for mod_rewrite to work transparently the location_walk would need to
be triggered again.

We could do that on our side with something like this:
[[[
Index: subversion/mod_dav_svn/mod_dav_svn.c
===
--- subversion/mod_dav_svn/mod_dav_svn.c(revision 1549924)
+++ subversion/mod_dav_svn/mod_dav_svn.c(working copy)
@@ -1121,8 +1121,21 @@
   const char *fs_path, *repos_basename, *repos_path;
   const char *ignore_cleaned_uri, *ignore_relative_path;
   int ignore_had_slash;
-  dir_conf_t *conf = ap_get_module_config(r-per_dir_config, dav_svn_module);
+  dir_conf_t *conf;

+  /* rewalk the location blocks if mod_rewrite rewrote something */
+  if (apr_table_get(r-notes, mod_rewrite_rewritten))
+{
+  r-per_dir_config = r-server-lookup_defaults;
+  ap_location_walk(r);
+#if AP_MODULE_MAGIC_AT_LEAST(20110117,0)
+  ap_if_walk(r);
+#endif
+}
+
+  /* Now that we have an accurate per_dir_config retrieve our config */
+  conf = ap_get_module_config(r-per_dir_config, dav_svn_module);
+
   /* module is not configured, bail out early */
   if (!conf-fs_path  !conf-fs_parent_path)
 return DECLINED;
]]]

But I really don't think we should for several reasons.

First of all it starts messing with httpd internals that I don't think we
should really need to be doing.  Note that I had to include the ap_if_walk()
for 2.3.x+.  This sort of fix would have to forever be kept in sync with the
code in ap_process_request_internal().  Making it rather brittle.

This changes the conf available to all the other modules that follow us in the
translate_name hooks.  Since we have to do this before we can tell if we're
configured, this impacts all traffic not just us.  We could go to the effort of
storing the state and then putting things back if we DECLINED, but that's not
as simple as it seems.  There is a cache for the location and if walks that I'm
not sure if we have to restore that.

It seems to me that this is something that mod_rewrite should be doing and if
since it isn't there's probably a really good reason for it not to do this.

So at this point I think this needs to be taken to the d...@httpd.apache.org
list.  I'll start a new thread over there.