Re: Disallow path info
Szekeres Jr., Edward wrote:
> I am not sure why something as simple as this would be considered "dicing
> and slicing", simply blocks any requests with "path[any character]info" in
> them

I already apologised for that comment.

> RewriteEngine On
> RewriteCond %{REQUEST_URI} ^.*(path.info).* [NC]
> RewriteRule ^(.*)$ - [F]

What would you put instead of the (path.info) above?
RE: Disallow path info
I am not sure why something as simple as this would be considered "dicing and slicing". It simply blocks any request with "path[any character]info" in it:

RewriteEngine On
RewriteCond %{REQUEST_URI} ^.*(path.info).* [NC]
RewriteRule ^(.*)$ - [F]

-----Original Message-----
From: Phillip Hellewell [mailto:ssh...@gmail.com]
Sent: Tuesday, March 24, 2015 11:12 AM
To: mod_perl list
Subject: Re: Disallow path info
RE: Disallow path info
Sorry for the late or redundant response.

Look into the Apache rewrite module; you can get fairly fine-grained control there. I use it to disallow web-based access to mount points that have to be served from within the server-accessible tree.

edward

From: Lathan Bidwell [mailto:lat...@andrews.edu]
Sent: Monday, March 23, 2015 10:05 PM
To: Phillip Hellewell
Cc: rand...@modperl.pl; mod_perl list; Randolf Richardson
Subject: Re: Disallow path info
Re: Disallow path info
Phillip Hellewell wrote:
> path-info may work for you, but it doesn't work for me, and I don't need it
> and I don't use it and I don't want it and it is entirely legitimate for me
> to not support it.

Sorry, I did not mean to offend you or attack your strategy here. I was just trying to alert you to possible pitfalls, having been there myself before: you start with a tweak to achieve one particular goal, then you find that in one case it doesn't work as expected, so you add another tweak, and before you know it, you find yourself in a hopeless spiral of tweak upon tweak.

I believe that I also got a bit confused, mixed up memories of a previous post to the list (entitled "How Do I change the Document Root Per Request"), and thought that your post was a follow-up to that one (which would have made it a lot of tweaks).

Sorry again.
Re: Disallow path info
On Tue, Mar 24, 2015 at 9:29 AM, Szekeres Jr., Edward wrote:
> I am not sure why something as simple as this would be considered "dicing
> and slicing", simply blocks any requests with "path[any character]info" in
> them
>
> RewriteEngine On
> RewriteCond %{REQUEST_URI} ^.*(path.info).* [NC]
> RewriteRule ^(.*)$ - [F]

The /path/info was just an example. I need to return 404 when any extra stuff is after the script, even just a single slash, e.g.:

http://mysite.example.com/myscript.pl/path/info
http://mysite.example.com/myscript.pl/junk
http://mysite.example.com/myscript.pl/hello/there
http://mysite.example.com/myscript.pl/wrong.jpeg
http://mysite.example.com/myscript.pl/

All of those should return 404 (and now they do with the fix I have in place).

Phillip
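
For readers who do want to try the mod_rewrite route for the URLs listed above, here is a minimal sketch rather than a tested recipe. It assumes the rule sits in server or virtual-host context, that every script ends in ".pl", and that no legitimate URL contains ".pl/" (for instance a directory whose name ends in ".pl"). None of these assumptions come from the thread.

```apache
# Sketch only: server or virtual-host context, where the pattern is
# matched against the URL path (for example /myscript.pl/junk).
RewriteEngine On

# A ".pl" followed by a slash means something trails the script name,
# even if it is only the slash itself.
# R=404 answers with Not Found; "-" leaves the URL unchanged.
RewriteRule "\.pl/" - [R=404,L]
```

Unlike the `path.info` pattern quoted above, this matches on the position of the slash rather than on the literal words "path" and "info", so it should cover all five example URLs.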
Re: Disallow path info
On Tue, Mar 24, 2015 at 3:44 AM, André Warnier wrote:
> You know, I am slowly getting the feeling that by dicing and slicing the
> URLs and fixing up things afterward, you are setting yourself up for some
> major headaches later on, when something changes and/or someone needs to
> figure out what is going on.

Umm, no I am not dicing and slicing URLs and fixing up things afterwards.

If I tried to solve this problem using the rewrite module, that would be dicing and slicing, and I'm not confident I would get the regex right.

If I tried to modify all my scripts to handle path_info and alter the URLs inside the returned HTML by, e.g., prepending ../ for however many parts are in the path info, *that* would really be dicing and slicing and I would probably mess up somewhere.

No, I'm not doing either of those. I'm doing a 3-line PerlFixupHandler to return 404 if path_info is set, which is the perfect fix for the fact that ModPerl ignores the "AcceptPathInfo off" Apache directive for some reason.

> It may be time to reconsider the issue "top-down", and maybe see if there is
> not a "more standard" way of achieving what you wanted to achieve in the
> first place (which I honestly have lost track of by now).

I already made the issue very clear. Just go read my first email. The only "more standard" way of achieving what I want would be for ModPerl to stop ignoring the "AcceptPathInfo off" Apache directive.

> I mean, mod_perl is great, in that it allows one to do just that kind of
> thing, at a relatively deep level within the Apache logic itself. But there
> lies also the danger of incrementally building up your very own webserver
> with its very own logic, which at some point does not match anymore what a
> HTTP-compliant webserver should do in some particular circumstance, and
> becomes very fragile and difficult to maintain or adapt to some new quirk
> which would appear on the WWW.

Umm, no. I'm not using modperl to accomplish some non-standard thing. An HTTP-compliant webserver is free to allow or disallow paths like http://mysite.example.com/myscript.pl/path/info, and Apache already supports either allowing or disallowing that, and without ModPerl I can achieve exactly what I want using the "AcceptPathInfo off" directive.

It is when I add ModPerl to the mix that the problem comes back, because ModPerl ignores "AcceptPathInfo off". That is a problem with ModPerl, not a problem with what I am trying to achieve, so fixing a problem caused by ModPerl with a ModPerl fixup handler makes perfect sense.

> For example, I believe that it is entirely HTTP-compliant for a URL to
> invoke a script and nevertheless pass it path-info information at the same
> time, for the script to use in some way. In this particular case, should it
> not be the script which discards the unwanted path_info if it doesn't want
> it ? I am not saying that this is the "right" answer, but it is maybe worth
> pondering, before introducing additional webserver logic which might have
> unintended side-effects in other cases.

I believe you are right, it is entirely HTTP-compliant to support path-info. But I believe it is also entirely HTTP-compliant to not support it.

My scripts already discard path_info, but that doesn't solve the problem I explained: if you enter http://mysite.example.com/myscript.pl/path/info, and my script returns HTML that includes a stylesheet link with the relative path css/default.css, then your web browser will try to download the css at http://mysite.example.com/myscript.pl/path/info/css/default.css, which is wrong and will return HTML from my script again instead of the actual css, which is at http://mysite.example.com/css/default.css.

path-info may work for you, but it doesn't work for me, and I don't need it and I don't use it and I don't want it and it is entirely legitimate for me to not support it.

Phillip
Re: Disallow path info
Phillip Hellewell wrote:
> Good news. I got a helpful tip from a Dr James Smith to use a
> PerlFixupHandler that looks like this:
> [...]
> It worked just great!

You know, I am slowly getting the feeling that by dicing and slicing the URLs and fixing up things afterward, you are setting yourself up for some major headaches later on, when something changes and/or someone needs to figure out what is going on.

It may be time to reconsider the issue "top-down", and maybe see if there is not a "more standard" way of achieving what you wanted to achieve in the first place (which I honestly have lost track of by now).

I mean, mod_perl is great, in that it allows one to do just that kind of thing, at a relatively deep level within the Apache logic itself. But there lies also the danger of incrementally building up your very own webserver with its very own logic, which at some point does not match anymore what a HTTP-compliant webserver should do in some particular circumstance, and becomes very fragile and difficult to maintain or adapt to some new quirk which would appear on the WWW.

For example, I believe that it is entirely HTTP-compliant for a URL to invoke a script and nevertheless pass it path-info information at the same time, for the script to use in some way. In this particular case, should it not be the script which discards the unwanted path_info if it doesn't want it? I am not saying that this is the "right" answer, but it is maybe worth pondering, before introducing additional webserver logic which might have unintended side-effects in other cases.

André
Re: Disallow path info
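Good news. I got a helpful tip from a Dr James Smith to use a PerlFixupHandler that looks like this:

package My::Fixup;

use strict;
use warnings;
use utf8;

use Apache2::RequestRec ();            # provides $r->path_info
use Apache2::Const qw(OK NOT_FOUND);

sub handler {
    my $r = shift;                     # request object passed in by mod_perl

    # Anything after the script name, even a lone "/", ends up in path_info.
    return NOT_FOUND if $r->path_info;
    return OK;
}

1;

It worked just great!

Phillip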
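
The message above does not show how the handler was hooked into Apache. One plausible way to wire it up is sketched below. The library path `/path/to/lib` is a placeholder, and the `ModPerl::Registry` setup for `.pl` files is an assumption about the site, not something stated in the thread.

```apache
# Sketch only. Assumes My/Fixup.pm is saved under /path/to/lib (placeholder)
# and that .pl scripts are served through ModPerl::Registry.
PerlSwitches -I/path/to/lib
PerlModule My::Fixup

<FilesMatch "\.pl$">
    SetHandler perl-script
    PerlResponseHandler ModPerl::Registry
    PerlOptions +ParseHeaders
    Options +ExecCGI

    # Runs in the fixup phase, before the response handler, so a request
    # with a non-empty path_info gets a 404 and the script never runs.
    PerlFixupHandler My::Fixup
</FilesMatch>
```

Because the check lives in the server configuration, none of the individual `.pl` scripts has to be modified, which was the stated goal.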
Re: Disallow path info
On Mon, Mar 23, 2015 at 8:05 PM, Lathan Bidwell wrote:
> Out of curiosity, Are there links that actually point to
> /myscript.pl/path/info/... ?

Nope, it was just something I accidentally stumbled onto while testing my site, so there's no concern about breaking any links.

Phillip
Re: Disallow path info
Out of curiosity, are there links that actually point to /myscript.pl/path/info/... ?

Because if you are trying to block them, then it sounds like you don't want to link to them either. Would it be possible to find how they are reaching that page and change the links?

Another perspective: if you change the links, then if they somehow get there... then they get a broken page.

On Mon, Mar 23, 2015 at 6:16 PM, Phillip Hellewell wrote:
> But I don't want to do a bunch of workarounds to "make it work". I
> want the user to get a 404.
Re: Disallow path info
On Mon, Mar 23, 2015 at 3:08 PM, Randolf Richardson wrote:
> Can you provide some additional detail about what you're doing?

I'm just trying to secure my website, and one problem right now is if someone enters http://mysite.example.com/myscript.pl/path/info, not only does it work (which I don't want), but the page is formatted all wrong because it can't find the css. My HTML uses a relative path, css/default.css, and because of the path of the URL, the web browser looks here for it:

http://mysite.example.com/myscript.pl/path/info/css/default.css

instead of here where it's actually at:

http://mysite.example.com/css/default.css

And worst of all, that really long crazy path also works, and returns HTML from my script, and it is not CSS.

This affects some js files that I include from my HTML too. What's scary is the fact that this means the browser tries to interpret the HTML that gets returned as both css and javascript.

And yes, I already know I can probably make this work by using absolute paths (but actually I can't, because in my real use case there is a parent folder in the path that comes before the script, on some servers but not others).

But I don't want to do a bunch of workarounds to "make it work". I want the user to get a 404.

> As far as I understand it, with "AcceptPathInfo Off" in effect, the
> "/path/info" portion should cause a 404 error. Of course, if this
> doesn't work on your system, one possible work-around would be to

Yes, and it does work on my system when using the default cgi handler. It only does not work when I am using the ModPerl handler.

> check that $r->path_info is empty and do the following if it isn't:
>
> $r->status(Apache2::Const::HTTP_NOT_FOUND);
> return Apache2::Const::HTTP_NOT_FOUND;

What is $r and where does it come from? I know I can check the PATH_INFO env var from my perl script, but my goal is to *not* have to modify all my .pl scripts to do extra checking.

Thanks,
Phillip
Re: Disallow path info
> Hello,
>
> I would like to disallow path info, i.e., respond with 404 if
> PATH_INFO is not empty, i.e., if the URL is something like
> http://mysite.example.com/myscript.pl/path/info.
>
> I tried the Apache directive "AcceptPathInfo Off", but sadly this only
> works with the normal cgi handler; ModPerl seems to ignore it.
>
> Thanks,
> Phillip

Can you provide some additional detail about what you're doing?

As far as I understand it, with "AcceptPathInfo Off" in effect, the "/path/info" portion should cause a 404 error. Of course, if this doesn't work on your system, one possible work-around would be to check that $r->path_info is empty and do the following if it isn't:

$r->status(Apache2::Const::HTTP_NOT_FOUND);
return Apache2::Const::HTTP_NOT_FOUND;

Randolf Richardson - rand...@inter-corporate.com
Inter-Corporate Computer & Network Services, Inc.
Beautiful British Columbia, Canada
http://www.inter-corporate.com/
Disallow path info
Hello,

I would like to disallow path info, i.e., respond with 404 if PATH_INFO is not empty, i.e., if the URL is something like http://mysite.example.com/myscript.pl/path/info.

I tried the Apache directive "AcceptPathInfo Off", but sadly this only works with the normal cgi handler; ModPerl seems to ignore it.

Thanks,
Phillip
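
For readers unfamiliar with the directive mentioned above, a minimal sketch of how it is typically set follows. The message does not say which container Phillip actually used, so the `<FilesMatch>` block here is only an assumption.

```apache
# Sketch only: reject trailing path info for every .pl script.
# Per the message above, this takes effect with the normal CGI handler
# but appears to be ignored when the script is run by mod_perl.
<FilesMatch "\.pl$">
    AcceptPathInfo Off
</FilesMatch>
```

With this in effect under plain CGI, a request such as /myscript.pl/path/info would be expected to return 404 instead of running the script.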