On Tue, Oct 02, 2001 at 08:33:21PM -0500, William A. Rowe, Jr. wrote:

>   I am very impressed by this idea for Apache 2.0.  But I don't like the
> many to many mapping.  If we change your underlying rule here to require
> that each filename extension is passed in sequence, I would be _very_ 
> happy to commit this patch :)  E.g. index.en _could_ match index.html.en.
> But index.en.html would _not_ match index.html.en.

By that, I take it you mean something like a requirement (5), or
perhaps (3b), to be added to the description below, along the lines of
(3b) "each .suffix in r->filename must exist in the real filename, in
the same sequence as it appears in r->filename"?  (r->filename here
means "the bit of it after the final /".)

>> The requirements are (1)r->filename up to the first dot must match the
>> real filename up to the first dot; (2)r->filename may not be longer
>> than the real filename; (3)each .suffix in r->filename must exist
>> (string match) in the real filename; (4)the real filename must
>> correspond to a known mime-type, encoding, etc -- which I think means
>> that the final suffix must be known, and only suffixes followed by
>> known suffixes are considered.

[ I note that others feel that (1) above should be replaced with
something more like "all of r->filename must match the start of the
real filename", which would make the remainder of this mail
irrelevant.  I'll continue anyway, but feel free to bin it if this is
the Wrong Thing ]

In case my interpretation of (3b) is unclear: as a (hopefully)
complete example, given the file "name.a.b.x.cd.e.f", and presuming
that "x" is the only suffix which is _not_ a recognised mime extension
(type, language, encoding, whatever) then which of the following
requests should be accepted, and which not?

name
name.a
name.a.b
name.a.c
name.a.cd
name.b.c.e
name.b.x.e.f
name.e
name.x.c
name.x.f
name.a.b...f

name.b.a
name.a.b.f.c
name.x.b
name.c.x
name.f.e

name.e.
name.e.e
name.f.

(my understanding is that the first group should all be passed down as
possibilities, the second group shouldn't, and the third group could
be anything.  I'd plump for "yes" for all three, probably.)

And for extra fun, which of the answers would be different if the file
were called "name.a.b.x.cd.e.f.e"?

(my understanding is that the third group definitely becomes "yes",
name.f.e from the second group becomes "yes", and the rest stay as
they were.)
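
To make the rule concrete, here is a minimal sketch of requirements
(1), (2) and (3b) as a standalone function.  It is not the patch
itself: ordered_suffix_match() and find_n() are hypothetical names, it
does not modify the request string, and it assumes that each search
resumes at the start of the previous match.  Requirement (4) is left
to the later code and is not checked here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: first occurrence of the first n bytes of
 * needle inside hay, or NULL if there is none. */
static const char *find_n(const char *hay, const char *needle, size_t n)
{
    size_t hay_len = strlen(hay);
    size_t i;

    for (i = 0; i + n <= hay_len; i++) {
        if (memcmp(hay + i, needle, n) == 0) {
            return hay + i;
        }
    }
    return NULL;
}

/* Hypothetical helper: returns 1 if "request" may be offered "real"
 * as a candidate under (1), (2) and (3b), and 0 otherwise. */
static int ordered_suffix_match(const char *request, const char *real)
{
    size_t base_len = strcspn(request, ".");  /* length of "name" */
    const char *req = request + base_len;     /* first dot, or end */
    const char *cursor;

    /* (2) the request may not be longer than the real filename */
    if (strlen(request) > strlen(real)) {
        return 0;
    }

    /* (1) both must agree up to, and including, the first dot */
    if (strncmp(request, real, base_len) != 0 || real[base_len] != '.') {
        return 0;
    }

    /* (3b) each ".suffix" must be found (string match) at or after
     * the place where the previous one was found.  Assumption: the
     * search resumes at the START of the previous match. */
    cursor = real + base_len;
    while (*req == '.') {
        size_t tok_len = 1 + strcspn(req + 1, ".");

        cursor = find_n(cursor, req, tok_len);
        if (cursor == NULL) {
            return 0;
        }
        req += tok_len;
    }
    return 1;
}
```

Under that assumption the function accepts every request in the first
and third groups for "name.a.b.x.cd.e.f" and rejects the second group;
for "name.a.b.x.cd.e.f.e" it additionally accepts name.f.e.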

As mentioned in the earlier mail, this patch just decides whether or
not to allow the file as a possibility -- later code gets a shot at
deciding how to handle the suffixes, so if any of the trailing
not-explicitly-listed-in-r->filename suffixes aren't actually
recognised, the only way to get the file would be to request it
by the full name, and therefore bypass mod_negotiation.

For the specific example above, this means that the only requests that
would actually return the file would be name.b.x.e.f, name.x.c, and
name.x.f.

The change to the patch to limit the matches as described above is
mostly straightforward -- instead of starting each strstr() at the
start of "name", start it at the point of the previous match (either
the start or the end -- it'd presumably make a difference if someone
requests "file.html.html").
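
The start-versus-end choice can be illustrated with a small sketch.
This is not the code in the patch: suffixes_in_order() and its
resume_at_end flag are hypothetical, and the file name
"file.html.en.gz" is made up for the example.  The function takes the
request and the real filename from their first dots onwards.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: checks that every ".suffix" in "suffixes"
 * occurs in "tail" in the same order.  If resume_at_end is non-zero
 * the next search starts after the previous match, otherwise it
 * starts at the beginning of the previous match. */
static int suffixes_in_order(const char *suffixes, const char *tail,
                             int resume_at_end)
{
    const char *cursor = tail;

    while (*suffixes == '.') {
        size_t tok_len = 1 + strcspn(suffixes + 1, ".");
        size_t left = strlen(cursor);
        const char *hit = NULL;
        size_t i;

        /* bounded substring search for this ".suffix" */
        for (i = 0; i + tok_len <= left; i++) {
            if (memcmp(cursor + i, suffixes, tok_len) == 0) {
                hit = cursor + i;
                break;
            }
        }
        if (hit == NULL) {
            return 0;
        }
        cursor = resume_at_end ? hit + tok_len : hit;
        suffixes += tok_len;
    }
    return 1;
}
```

With these assumptions, a request for "file.html.html" against
"file.html.en.gz" is accepted when the search resumes at the start of
the previous match (the same ".html" satisfies both suffixes) and
rejected when it resumes at the end.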

A new const char * which points into dirent.name is the only addition
over the previous patch.  However, unless someone has a
case-insensitive strstr() lying around, the CASE_BLIND_FILESYSTEM
case won't work sensibly -- the "name" part would be matched
case-insensitively, but each suffix would still be matched
case-sensitively by strstr().
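
For what it's worth, a case-insensitive strstr() is short enough to
write by hand.  The following is only a sketch using standard C;
ci_strstr() is a hypothetical name, not an existing APR or httpd
function, and it relies on tolower(), so it follows the current C
locale rather than any filesystem-specific case rules.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: like strstr(), but compares bytes after
 * folding them with tolower().  Returns a pointer to the first match
 * in hay, or NULL if needle does not occur. */
static const char *ci_strstr(const char *hay, const char *needle)
{
    if (*needle == '\0') {
        return hay;             /* same convention as strstr() */
    }
    for (; *hay != '\0'; hay++) {
        const char *h = hay;
        const char *n = needle;

        /* advance while the folded bytes agree; stops at the end of
         * either string because '\0' never equals a needle byte */
        while (*n != '\0' &&
               tolower((unsigned char)*h) == tolower((unsigned char)*n)) {
            h++;
            n++;
        }
        if (*n == '\0') {
            return hay;         /* whole needle matched here */
        }
    }
    return NULL;
}
```

Something like this could stand in for strstr() under #ifdef
CASE_BLIND_FILESYSTEM, so that "name.HTML.en" would still satisfy a
request containing ".html".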

I'm including the reworked patch below, in case it's considered
useful.  Written and somewhat tested against mod_negotiation.c from
httpd-2.0.25; it applies cleanly to CVS version 1.84.

        f
-- 
Francis Daly        [EMAIL PROTECTED]

=============================

--- mod_negotiation.c.orig      Tue Aug 28 04:08:31 2001
+++ mod_negotiation.c   Wed Oct  3 21:44:12 2001
@@ -1019,6 +1019,11 @@
     struct var_rec mime_info;
     struct accept_rec accept_info;
     void *new_var;
+    char *pos;
+    int pos_len;
+    int not_this_dirent;        /* actually, boolean. */
+    int dots_in_request = 0;    /* 1 == one dot, 2 == some dots */
+    const char *dpos;           /* points into the dirent.name */
     int anymatch = 0;
 
     clean_var_rec(&mime_info);
@@ -1041,20 +1046,92 @@
         return HTTP_FORBIDDEN;
     }
 
+    if ((pos = strchr(filp, '.'))) {
+        dots_in_request = 1;
+        if (strchr(++pos, '.')) {
+            dots_in_request = 2;
+        }
+    }
+
     while (apr_dir_read(&dirent, APR_FINFO_DIRENT, dirp) == APR_SUCCESS) {
         apr_array_header_t *exception_list;
         request_rec *sub_req;
         
-        /* Do we have a match? */
+        if (!dots_in_request) {
+
+            /* Given "name", check for "name." */
 #ifdef CASE_BLIND_FILESYSTEM
-        if (strncasecmp(dirent.name, filp, prefix_len)) {
+            if (strncasecmp(dirent.name, filp, prefix_len)) {
 #else
-        if (strncmp(dirent.name, filp, prefix_len)) {
+            if (strncmp(dirent.name, filp, prefix_len)) {
 #endif
-            continue;
-        }
-        if (dirent.name[prefix_len] != '.') {
-            continue;
+                continue;
+            }
+            if (dirent.name[prefix_len] != '.') {
+                continue;
+            }
+
+        } else {
+
+            /* Given "name.suffixes", check for "name." */
+            pos = strchr(filp, '.');
+            pos_len = pos - filp + 1;
+#ifdef CASE_BLIND_FILESYSTEM
+            if (strncasecmp(dirent.name, filp, pos_len)) {
+#else
+            if (strncmp(dirent.name, filp, pos_len)) {
+#endif
+                continue;
+            }
+
+            /* Next search can start at the first dot in dirent.name */
+            dpos = &dirent.name[pos_len-1];
+            not_this_dirent = 0;
+            filp = ++pos;
+
+            /* Given "name.suf1.suf2.suffix", check for each ".sufN",
+               somewhere after the previous match */
+            if (2 == dots_in_request) {
+                /* Give up now if the request is longer than the file */
+                if (prefix_len > strlen(dirent.name)) {
+                    filp -= pos_len;
+                    continue;
+                }
+
+                while ((pos = strchr(filp, '.'))) {
+
+                    --filp;
+                    pos_len = pos - filp ;
+                    filp[pos_len]='\0';
+                    if ((dpos = strstr(dpos, filp)) == NULL) {
+                        not_this_dirent=1;
+                    }
+
+                    filp[pos_len] = '.';
+                    filp += pos_len + 1;
+                    
+                    if (not_this_dirent) {
+                        /* get to next dirent */
+                        break;
+                    }
+                }
+                if (not_this_dirent) {
+                    /* reset filp before trying next dirent */
+                    pos_len = strlen(filp);
+                    filp -= prefix_len - pos_len;
+                    continue;
+                }
+            }
+            --filp;
+            pos_len = strlen(filp);
+
+            /* Check for the final ".suffix" */
+
+            if (!strstr(dpos, filp)) {
+                filp -= prefix_len - pos_len;
+                continue;
+            }
+            filp -= prefix_len - pos_len;
         }
 
         /* Ok, something's here.  Maybe nothing useful.  Remember that
