Re: [Repoze-dev] repoze.vhm and repoze.zope2 odd behavior
Thanks for the feedback on this. I've set up some tests that I think illustrate what is going on pretty well.

>> What ends up happening in the current situation is that a url of /a/b/c/d get chopped off to something like /b/c/d NOT including the virtual root.
>
> Sorry, where is this happening?

See below with the test results.

>> I hate to go through it like this. I don't have svn access though and either I'm completely misunderstanding something or this is a rather ugly bug.
>
> We can get you svn access for sure. :)

How can I go about getting that?

> It is annoying. My suggestion would be:
>
> - Try the same setup with a standard Zope and vhosting via the old VirtualHostMonster, just to have a baseline. Write a simple view that prints the request (in particular the keys VIRTUAL_URL, PATH_INFO, ACTUAL_URL and URL)
> - Use the same view in a setup that uses repoze.vhm#vhm_path and a rewrite rule. Are you getting the same behaviour?
> - Use the same view in a setup that uses repoze.vhm#vhm_xheaders and set headers. Are you getting the same behaviour?
>
> Maybe the first setup would take some time to set up, but using a rewrite rule + vhm_path should be a minor change from using custom headers and vhm_xheaders. At least then we can find out where, if anywhere, there are differences. Then we can dig deeper.

Here are the test results. I've included the Apache configuration in there just so you can see what I'm up to.
Results
=======

All requests come in on http://example.com:/a/b/c/d/@@testing

Not using wsgi and virtual host monster
---------------------------------------

<VirtualHost *:>
    ServerName example.com
    ServerAlias example.com
    RewriteEngine On
    RewriteRule ^/(.*) http://127.0.0.1:8499/VirtualHostBase/http/example.com:/example/VirtualHostRoot/$1 [L,P]
</VirtualHost>

VIRTUAL_URL = http://example.com:/a/b/c/d/@@testing
PATH_INFO = /VirtualHostBase/http/example.com:/example/VirtualHostRoot/a/b/c/d/@@testing
ACTUAL_URL = http://example.com:/a/b/c/d/@@testing
URL = http://example.com:/a/b/c/d/@@testing
SCRIPT_NAME =
repoze.vhm.virtual_root =
repoze.vhm.virtual_url =
repoze.vhm.virtual_host_base =
HTTP_X_VHM_HOST =
HTTP_X_VHM_ROOT =

Using wsgi with vhm_path
------------------------

VIRTUAL_URL = http://example.com:/a/b/c/d/@@testing
PATH_INFO = /example/a/b/c/d/@@testing
ACTUAL_URL = http://example.com:/a/b/c/d/@@testing
URL = http://example.com:/a/b/c/d/@@testing
SCRIPT_NAME =
repoze.vhm.virtual_root = /example
repoze.vhm.virtual_url = http://example.com:/a/b/c/d/@@testing
repoze.vhm.virtual_host_base = example.com:
HTTP_X_VHM_HOST =
HTTP_X_VHM_ROOT =

Using wsgi with vhm_xheaders
----------------------------

<VirtualHost *:>
    ServerName example.com
    ServerAlias example.com
    RewriteEngine On
    RewriteRule ^/(.*) http://127.0.0.1:8499/$1 [P,L]
    RequestHeader add X-Vhm-Host http://example.com:
    RequestHeader add X-Vhm-Root /example
</VirtualHost>

VIRTUAL_URL = http://example.com:/b/c/d/@@testing
PATH_INFO = /a/b/c/d/@@testing
ACTUAL_URL = http://example.com:/b/c/d/@@testing
URL = http://example.com:/a/b/c/d/@@testing
SCRIPT_NAME =
repoze.vhm.virtual_root = /example
repoze.vhm.virtual_url = http://example.com:/b/c/d/@@testing
repoze.vhm.virtual_host_base = example.com:
HTTP_X_VHM_HOST = http://example.com:
HTTP_X_VHM_ROOT = /example

It does seem like we have a problem here, as it's chopping off the first part of the path with the xheaders setup. I'm surprised that the site works as well as it does with this kind of issue.
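For anyone wanting to repeat the comparison without writing a Zope view, the WSGI-level keys can be dumped from a small filter placed after repoze.vhm in the pipeline. This is only a sketch: `EnvironDumper` and `format_environ` are hypothetical names, not part of repoze.vhm or repoze.zope2, and as far as I know VIRTUAL_URL, ACTUAL_URL and URL live on the Zope request rather than in the WSGI environ, so they would still need a view.

```python
import sys

# WSGI environ keys compared in the results above.
KEYS = (
    'PATH_INFO',
    'SCRIPT_NAME',
    'repoze.vhm.virtual_root',
    'repoze.vhm.virtual_url',
    'repoze.vhm.virtual_host_base',
    'HTTP_X_VHM_HOST',
    'HTTP_X_VHM_ROOT',
)


def format_environ(environ, keys=KEYS):
    """Render the selected keys as 'NAME = value' lines (blank if unset)."""
    return '\n'.join('%s = %s' % (key, environ.get(key, '')) for key in keys)


class EnvironDumper(object):
    """Hypothetical WSGI filter that logs the selected keys on each request."""

    def __init__(self, application, stream=sys.stderr):
        self.application = application
        self.stream = stream

    def __call__(self, environ, start_response):
        # Write the report, then hand the request on unchanged.
        self.stream.write(format_environ(environ) + '\n')
        return self.application(environ, start_response)
```

Wrapping the application once before and once after the vhm filter would show exactly what each filter changes.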
Thanks again,
Nathan

On Tue, Dec 29, 2009 at 5:04 AM, Martin Aspeli optilude+li...@gmail.com wrote:

> Hi Nathan,
>
>> So it seems that VIRTUAL_URL and ACTUAL_URL end up being exactly the same in this setup.
>
> Yes, that's probably to be expected. ACTUAL_URL is basically the URL the user sees. In a path based VHM setup, that's http://example.com when your internal URL is http://example.com/VirtualHostBase/http/localhost:80/Plone/VirtualHostRoot
>
> In a header-based VHM setup, you don't have path munging, so if Apache is hosted on http://example.com, ACTUAL_URL is too. VIRTUAL_URL is the URL after VHM translation, so in most cases VIRTUAL_URL and ACTUAL_URL will be the same. With vhm_xheaders, they're basically set to be exactly the same, I think, as in environ['ACTUAL_URL'] = environ['VIRTUAL_URL'], though I could be wrong about that.
>
>> Also, I'm looking at the tests in the repoze.vhm package, and it seems the only reason why the tests pass is because they are using an incorrect PATH_INFO. I'll try and setup an example and show what I mean,
>>
>>     vhm host = http://example.com/
>>     vhm root = /example
>>     request = http://example.com/a/b/c/d
>>
>> PATH_INFO should be /a/b/c/d right?
>
> No, I don't think so - see below. The way
Re: [Repoze-dev] repoze.vhm and repoze.zope2 odd behavior
Maybe I should have explained the tests better then. All the requests are for the URL http://example.com:/a/b/c/d/@@testing and the Plone site is located at /example from the Zope root.

On Wed, Dec 30, 2009 at 12:03 AM, Tres Seaver tsea...@palladion.com wrote:

> OK, to analyze the traditional Apache + Zope setup: you are asking for requests to 'http://example.com/*' to be rewritten onto the Zope server at port , with the '/example' folder serving as the virtual root. URLs generated by Zope should be relative to that root. So, the absolute_url for the object whose physical path is '/example/foo' should be 'http://example.com/foo'. Correct?

Yes

>> Using wsgi with vhm_path
>
> Same rewrite rule as before (i.e., proxying onto a paste server)?

Yes

>> Using wsgi with vhm_xheaders
>>
>> <VirtualHost *:>
>>     ServerName example.com
>>     ServerAlias example.com
>>     RewriteEngine On
>>     RewriteRule ^/(.*) http://127.0.0.1:8499/$1 [P,L]
>>     RequestHeader add X-Vhm-Host http://example.com:
>>     RequestHeader add X-Vhm-Root /example
>> </VirtualHost>
>
> You aren't rewriting onto '/example' here, so why are you setting it as the virtual root?

That's how the xheader filter works. The simple example from the docs shows

    ServerName www.example.com
    RewriteEngine on
    RewriteRule ^/(.*) http://localhost:8080/$1 [P,L]
    RequestHeader add X-Vhm-Host http://www.example.com
    RequestHeader add X-Vhm-Root /cms

So I am doing it correctly.

>> VIRTUAL_URL = http://example.com:/b/c/d/@@testing
>> PATH_INFO = /a/b/c/d/@@testing
>> ACTUAL_URL = http://example.com:/b/c/d/@@testing
>> URL = http://example.com:/a/b/c/d/@@testing
>> SCRIPT_NAME =
>> repoze.vhm.virtual_root = /example
>> repoze.vhm.virtual_url = http://example.com:/b/c/d/@@testing
>> repoze.vhm.virtual_host_base = example.com:
>> HTTP_X_VHM_HOST = http://example.com:
>> HTTP_X_VHM_ROOT = /example
>>
>> It does seem like we have a problem here, as it's chopping off the first part of the path with the xheaders setup. I'm surprised that the site works as well as it does with this kind of issue.
>
> I think your configuration is incorrect: the 'X-Vhm-Root' header is supposed to signal the phantom path prefix in the URL apparent to the app.

It is. /example is the actual root of Zope to prepend every request with. The site still serves the correct objects, just some things don't work completely right. I still think my configuration is correct. If not, how should it look?

Thanks,
Nathan
Re: [Repoze-dev] repoze.vhm and repoze.zope2 odd behavior
Nathan Van Gheem wrote:

> Thanks for the feedback on this. I've set up some tests that I think illustrate what is going on pretty well.
>
>> We can get you svn access for sure. :)
>
> How can I go about getting that?

Email Chris McDonough. At least that's what I did. ;)

> Here are the test results. I've included the Apache configuration in there just so you can see what I'm up to.
>
> Results
> =======
>
> All requests come in on http://example.com:/a/b/c/d/@@testing
>
> Not using wsgi and virtual host monster
> ---------------------------------------
>
> <VirtualHost *:>
>     ServerName example.com
>     ServerAlias example.com
>     RewriteEngine On
>     RewriteRule ^/(.*) http://127.0.0.1:8499/VirtualHostBase/http/example.com:/example/VirtualHostRoot/$1 [L,P]
> </VirtualHost>
>
> VIRTUAL_URL = http://example.com:/a/b/c/d/@@testing
> PATH_INFO = /VirtualHostBase/http/example.com:/example/VirtualHostRoot/a/b/c/d/@@testing
> ACTUAL_URL = http://example.com:/a/b/c/d/@@testing
> URL = http://example.com:/a/b/c/d/@@testing
> SCRIPT_NAME =
> repoze.vhm.virtual_root =
> repoze.vhm.virtual_url =
> repoze.vhm.virtual_host_base =
> HTTP_X_VHM_HOST =
> HTTP_X_VHM_ROOT =
>
> Using wsgi with vhm_path
>
> VIRTUAL_URL = http://example.com:/a/b/c/d/@@testing
> PATH_INFO = /example/a/b/c/d/@@testing
> ACTUAL_URL = http://example.com:/a/b/c/d/@@testing
> URL = http://example.com:/a/b/c/d/@@testing
> SCRIPT_NAME =
> repoze.vhm.virtual_root = /example
> repoze.vhm.virtual_url = http://example.com:/a/b/c/d/@@testing
> repoze.vhm.virtual_host_base = example.com:
> HTTP_X_VHM_HOST =
> HTTP_X_VHM_ROOT =

So - this looks identical to the vanilla example (VIRTUAL_URL, ACTUAL_URL, URL). Does this configuration work as expected?
> Using wsgi with vhm_xheaders
>
> <VirtualHost *:>
>     ServerName example.com
>     ServerAlias example.com
>     RewriteEngine On
>     RewriteRule ^/(.*) http://127.0.0.1:8499/$1 [P,L]
>     RequestHeader add X-Vhm-Host http://example.com:
>     RequestHeader add X-Vhm-Root /example
> </VirtualHost>
>
> VIRTUAL_URL = http://example.com:/b/c/d/@@testing
> PATH_INFO = /a/b/c/d/@@testing
> ACTUAL_URL = http://example.com:/b/c/d/@@testing
> URL = http://example.com:/a/b/c/d/@@testing
> SCRIPT_NAME =
> repoze.vhm.virtual_root = /example
> repoze.vhm.virtual_url = http://example.com:/b/c/d/@@testing
> repoze.vhm.virtual_host_base = example.com:
> HTTP_X_VHM_HOST = http://example.com:
> HTTP_X_VHM_ROOT = /example
>
> It does seem like we have a problem here, as it's chopping off the first part of the path with the xheaders setup. I'm surprised that the site works as well as it does with this kind of issue.

That suggests that the number of elements chopped off is the same as the number of elements in X-VHM-Root. So, there is probably a bug in this scenario when:

- repoze.vhm.virtual_url/VIRTUAL_URL/ACTUAL_URL is being set, and:
- we have a repoze.vhm.virtual_root

I'd try to construct a test case for repoze.vhm that illustrates this. I think you were right that the problem is here:

    virtual_url_parts += real_path[len(vroot_path):]

vroot_path is a list of the elements in the path given by X-VHM-Root, so that's where it's chopping off. real_path is a list of the elements in PATH_INFO.

This still looks fishy, though. PATH_INFO, as far as I recall, needs to contain the full path, from the Zope root. This is what repoze.zope2's z2bob is using to traverse (I think...). That code clearly assumes this is the case, and so lops off the prefix which should be hidden to the user.

In your case, PATH_INFO is /a/b/c/d/@@testing. I would've expected it to be /example/a/b/c/d/@@testing, as it is in the first two examples.
Perhaps repoze.zope2 has special handling for the case where repoze.vhm.virtual_root is set and prefixes that to PATH_INFO when it decides what to traverse to? That seems wrong. But then, it's set in the vhm_path case too.

It is possible that z2bob is doing that prefixing, and we're seeing a traversal like /example/example/a/b/c/d/@@testing, which would work because of acquisition, possibly.

Martin

--
Author of `Professional Plone Development`, a book for developers who want to work with Plone.
See http://martinaspeli.net/plone-book

_______________________________________________
Repoze-dev mailing list
Repoze-dev@lists.repoze.org
http://lists.repoze.org/listinfo/repoze-dev
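To make the slicing issue concrete, here is a minimal sketch of what the quoted line does to the two PATH_INFO values from the test results. It does not reproduce repoze.vhm's actual code: only the slice `real_path[len(vroot_path):]` comes from the thread, while `split_path` and `virtual_path` are hypothetical helpers written for illustration.

```python
def split_path(path):
    """Split a URL path into its non-empty segments."""
    return [segment for segment in path.split('/') if segment]


def virtual_path(path_info, vroot):
    """Hypothetical helper: drop len(vroot) leading segments from PATH_INFO.

    It mirrors only the slice quoted in the thread and never checks that
    the dropped segments actually match the virtual root.
    """
    real_path = split_path(path_info)
    vroot_path = split_path(vroot)
    return '/' + '/'.join(real_path[len(vroot_path):])


# PATH_INFO includes the virtual root (the vhm_path case).
print(virtual_path('/example/a/b/c/d/@@testing', '/example'))

# PATH_INFO lacks the virtual root (the vhm_xheaders case): 'a' is lost.
print(virtual_path('/a/b/c/d/@@testing', '/example'))
```

The first call yields /a/b/c/d/@@testing and the second /b/c/d/@@testing, which matches the VIRTUAL_URL paths reported for the two WSGI setups.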
Re: [Repoze-dev] repoze.vhm and repoze.zope2 odd behavior
On Wed, Dec 30, 2009 at 7:06 AM, Martin Aspeli optilude+li...@gmail.com wrote:

> So - this looks identical to the vanilla example (VIRTUAL_URL, ACTUAL_URL, URL). Does this configuration work as expected?

Yes.

> In your case, PATH_INFO is /a/b/c/d/@@testing. I would've expected it to be /example/a/b/c/d/@@testing, as it is in the first two examples.

Exactly.

> Perhaps repoze.zope2 has special handling for the case where repoze.vhm.virtual_root is set and prefixes that to PATH_INFO when it decides what to traverse to? That seems wrong. But then, it's set in the vhm_path case too.
>
> It is possible that z2bob is doing that prefixing, and we're seeing a traversal like /example/example/a/b/c/d/@@testing, which would work because of acquisition, possibly.

I assumed it was just acquisition trickery. Looking more closely, z2bob.py uses the repoze.vhm.virtual_root via the getVirtualRoot in the repoze.vhm.utils package to get the virtual root, and it seems to create the correct path for traversal regardless of the issue. Just some small things are off, like content actions, and I just noticed that related items don't show either.

> I'd try to construct a test case for repoze.vhm that illustrates this. I think you were right that the problem is here:
>
>     virtual_url_parts += real_path[len(vroot_path):]

This is kind of why I said earlier that the tests weren't actually testing it correctly. We should already be testing for this, it's just that the existing tests expect wrong results. The test assumes the PATH_INFO does NOT change after the filter application, which is wrong. If PATH_INFO is how it decides on what object to traverse to, then we should be explicitly setting it in the xheader filter like is done in the xpath filter.

For instance, the only change needed is that the filter should look like this:

Index: repoze/vhm/middleware.py
===================================================================
--- repoze/vhm/middleware.py (revision 7747)
+++ repoze/vhm/middleware.py (working copy)
@@ -85,6 +85,7 @@
     def __call__(self, environ, start_response):
         host_header = environ.get('HTTP_X_VHM_HOST')
         root_header = environ.get('HTTP_X_VHM_ROOT')
+        environ['PATH_INFO'] = root_header + environ['PATH_INFO']
         munge(environ, host_header, root_header)
         return self.application(environ, start_response)

along with the tests being fixed.

I'm sorry if I made this a bigger deal than it really is--seems like a simple fix. I've been strapped for time and I'm just trying to do what I can... Let me know what you think.

Thanks,
Nathan
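One caveat with the one-line patch proposed in this message: `environ.get('HTTP_X_VHM_ROOT')` returns None when the header is absent, so the string concatenation would raise a TypeError for requests that do not carry X-Vhm-Root. A hedged sketch of a guarded variant follows. The class name `PrefixingXHeadersFilter` and the helpers `echo_app` and `call` are hypothetical, the call to repoze.vhm's `munge` is left out so the sketch stays self-contained, and none of this is the code that was actually committed.

```python
class PrefixingXHeadersFilter(object):
    """Hypothetical WSGI filter: prepend X-Vhm-Root to PATH_INFO when set."""

    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):
        root_header = environ.get('HTTP_X_VHM_ROOT')
        if root_header:
            # Normalise to one leading slash and no trailing slash.
            root = '/' + root_header.strip('/')
            if root != '/':
                environ['PATH_INFO'] = root + environ.get('PATH_INFO', '')
        # The real filter would call munge(environ, host_header, root_header)
        # at this point; that step is omitted in this sketch.
        return self.application(environ, start_response)


def echo_app(environ, start_response):
    """Tiny WSGI app that returns the PATH_INFO it received."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [environ['PATH_INFO'].encode('utf-8')]


def call(app, environ):
    """Invoke a WSGI app with a throwaway start_response and join the body."""
    return b''.join(app(environ, lambda status, headers: None))


app = PrefixingXHeadersFilter(echo_app)
print(call(app, {'PATH_INFO': '/a/b/c/d', 'HTTP_X_VHM_ROOT': '/example'}))
print(call(app, {'PATH_INFO': '/a/b/c/d'}))
```

With the header set, the downstream app sees /example/a/b/c/d; without it, PATH_INFO passes through untouched instead of failing.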
Re: [Repoze-dev] repoze.vhm and repoze.zope2 odd behavior
Nathan Van Gheem wrote:

>> So - this looks identical to the vanilla example (VIRTUAL_URL, ACTUAL_URL, URL). Does this configuration work as expected?
>
> Yes.

As an aside, if you're using RewriteEngine anyway, you might as well use repoze.vhm#vhm_path. The X-header stuff is mostly useful if you're using mod_wsgi. I'm glad we're trying to fix it, though. :)

>> Perhaps repoze.zope2 has special handling for the case where repoze.vhm.virtual_root is set and prefixes that to PATH_INFO when it decides what to traverse to? That seems wrong. But then, it's set in the vhm_path case too.
>>
>> It is possible that z2bob is doing that prefixing, and we're seeing a traversal like /example/example/a/b/c/d/@@testing, which would work because of acquisition, possibly.
>
> I assumed it was just acquisition trickery. Looking more closely, z2bob.py uses the repoze.vhm.virtual_root via the getVirtualRoot in the repoze.vhm.utils package to get the virtual root and it seems to create the correct path for traversal regardless of the issue. Just some small things are off, like content actions and I just noticed that related items don't show also.

Well, if ACTUAL_URL is wrong, then that's a serious bug.

>> I'd try to construct a test case for repoze.vhm that illustrates this. I think you were right that the problem is here:
>>
>>     virtual_url_parts += real_path[len(vroot_path):]
>
> This is kind of why I said earlier that the tests weren't actually testing it correctly. We should already be testing for this, it's just that the existing tests expect wrong results. The test assumes the PATH_INFO does NOT change after the filter application which is wrong.

Right. It happens. :) We should fix those tests. I remember fixing other such wrong tests before, so I'm not wholly surprised.

> If PATH_INFO is how it decides on what object to traverse to, then we should be explicitly setting it in the xheader filter like is done in the xpath filter.

I think that'd be safer, but you may want to ask Chris what he thinks as well.
Looking at how plain Zope 2 behaves, PATH_INFO is meant to contain the full path that Zope is traversing to, virtual hosting or not. So maybe, by setting PATH_INFO correctly, we can avoid the interplay between repoze.zope2 and repoze.vhm entirely, or at least reduce it. I guess setting ACTUAL_URL is still a repoze.zope2 responsibility, but beyond that, trying to get bits of information from repoze.vhm creates a slightly weird dependency across two packages.

> For instance, the only change needed is that the filter should look like this,
>
> Index: repoze/vhm/middleware.py
> ===================================================================
> --- repoze/vhm/middleware.py (revision 7747)
> +++ repoze/vhm/middleware.py (working copy)
> @@ -85,6 +85,7 @@
>      def __call__(self, environ, start_response):
>          host_header = environ.get('HTTP_X_VHM_HOST')
>          root_header = environ.get('HTTP_X_VHM_ROOT')
> +        environ['PATH_INFO'] = root_header + environ['PATH_INFO']
>          munge(environ, host_header, root_header)
>          return self.application(environ, start_response)
>
> along with the tests being fixed.
>
> I'm sorry if I made this a bigger deal than it really is--seems like a simple fix. I've been strapped for time and I'm just trying to do what I can... Let me know what you think.

This stuff is both complicated and time-consuming to test, so I'm glad you've stuck with it. I'd say that if you are able to test both configurations with a real Zope 2 as well as fix the tests, that'd be ideal.

I would drop Chris a line if he's not reading this and just ask if he has an opinion. I suspect he'll say "I hate Zope 2" and leave you to it. In which case, ask for commit access, too. ;)

Martin
Re: [Repoze-dev] repoze.vhm and repoze.zope2 odd behavior
Martin Aspeli wrote:

>> I assumed it was just acquisition trickery. Looking more closely, z2bob.py uses the repoze.vhm.virtual_root via the getVirtualRoot in the repoze.vhm.utils package to get the virtual root and it seems to create the correct path for traversal regardless of the issue. Just some small things are off, like content actions and I just noticed that related items don't show also.
>
> Well, if ACTUAL_URL is wrong, then that's a serious bug.

TBH, I never really understood what ACTUAL_URL was supposed to do. Limi added it at some point, but I'm pretty sure he never really understood what it was supposed to do either. ;-) Or at least he'll deny knowing anything about it.

>> If PATH_INFO is how it decides on what object to traverse to, then we should be explicitly setting it in the xheader filter like is done in the xpath filter.
>
> I think that'd be safer, but you may want to ask Chris what he thinks as well.
>
> Looking at how plain Zope 2 behaves, PATH_INFO is meant to contain the full path that Zope is traversing to, virtual hosting or not. So maybe, by setting PATH_INFO correctly, we can avoid the interplay between repoze.zope2 and repoze.vhm entirely, or at least reduce it. I guess setting ACTUAL_URL is still a repoze.zope2 responsibility, but beyond that, trying to get bits of information from repoze.vhm creates a slightly weird dependency across two packages.

I'm going to defer to the professionals here; I don't use this package much anymore, so you'll need to use some judgment and check in a fix that is more correct I think.

- C
Re: [Repoze-dev] repoze.vhm and repoze.zope2 odd behavior
Chris McDonough wrote:

> Martin Aspeli wrote:
>
>>> I assumed it was just acquisition trickery. Looking more closely, z2bob.py uses the repoze.vhm.virtual_root via the getVirtualRoot in the repoze.vhm.utils package to get the virtual root and it seems to create the correct path for traversal regardless of the issue. Just some small things are off, like content actions and I just noticed that related items don't show also.
>>
>> Well, if ACTUAL_URL is wrong, then that's a serious bug.
>
> TBH, I never really understood what ACTUAL_URL was supposed to do. Limi added it at some point, but I'm pretty sure he never really understood what it was supposed to do either. ;-) Or at least he'll deny knowing anything about it.

Ah, history...

Limi *wanted* ACTUAL_URL to be what the user sees, so basically the virtual or real URL, *and* the query string, if any. What he got was something which is basically the same as VIRTUAL_URL if that's set, or URL if not.

It's still useful, though, since VIRTUAL_URL is not set at all if you're not using VHM (in standard Zope). So request['ACTUAL_URL'] is a safe way to get the real URL people see. And to get what Alex wanted, you'd need request['ACTUAL_URL'] + '?' + request['QUERY_STRING'].

> I'm going to defer to the professionals here; I don't use this package much anymore, so you'll need to use some judgment and check in a fix that is more correct I think.

I figured. :) I think Nathan's on the right path, though we just need to do some sanity checks with a live running site to be sure.

Martin
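The ACTUAL_URL plus QUERY_STRING expression above leaves a trailing '?' when the query string is empty. A small sketch that guards against that is shown below; `full_url` is a hypothetical helper name, and a plain dict stands in for the Zope request, on the assumption that the request supports the same `[]` and `.get` lookups.

```python
def full_url(request):
    """Return ACTUAL_URL plus the query string, if there is one."""
    url = request['ACTUAL_URL']
    query = request.get('QUERY_STRING', '')
    # Only append '?' when there is something to put after it.
    return url + '?' + query if query else url


# A plain dict stands in for the request object here.
print(full_url({'ACTUAL_URL': 'http://example.com/a/b', 'QUERY_STRING': 'x=1'}))
print(full_url({'ACTUAL_URL': 'http://example.com/a/b', 'QUERY_STRING': ''}))
```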